The R language (i) is widely used by physical, chemical, and biological researchers and (ii) comes with lots of interesting and easily loaded data sets. Unfortunately and surprisingly, a data set of commonly encountered physical constants is not one of them. If you’re writing R code to transform data based on physical equations, chances are these equations will involve physical constants such as Planck’s constant or the Newtonian gravitational constant.

Examples of data bundled in R include `mtcars` and other toy data that pop up in examples and tutorials everywhere. They can be loaded via commands like `data(mtcars)`. It seems natural to have a data set for physical constants that could be loaded similarly, but as far as I know there isn’t one. And it isn’t just me who’s been wondering; the issue came up on StackOverflow more than three years ago.

The best source for values of physical constants is NIST, which publishes the values recommended by the Committee on Data for Science & Technology (CODATA). CODATA routinely aggregates data from experiments that measure a constant and adjusts all the aggregated values so the overall data set is self-consistent. As of today it seems like the most recent update was from 2010. NIST data for the physical constants is browsable on their web page; a flat ASCII file is here.

Here’s some R code for transforming this ASCII table into an R data frame.

```
# parse the NIST CODATA table of physical constants
library(stringr)  # simplifies regex usage syntax in R
# URL of the flat ASCII table
link <- 'http://physics.nist.gov/cuu/Constants/Table/allascii.txt'
allstr <- readLines(link)  # pass the variable, not the string 'link'
codata.str <- allstr[11:345]  # data starts at line 11
# at least 3 spaces separate columns
codata.str2 <- str_replace_all(codata.str, '[ ]{3,100}', '\t')
# four cols in source table
codata.mat <- str_split_fixed(codata.str2, '\t', n = 4)
# eliminate spaces separating every three decimal digits
codata.mat[, c(2, 3)] <- str_replace_all(codata.mat[, c(2, 3)], " ", "")
# as.numeric() yields NA (with a warning) for non-numeric entries such as "(exact)"
codata <- data.frame(quantity = codata.mat[, 1],
                     value = as.numeric(codata.mat[, 2]),
                     uncertainty = as.numeric(codata.mat[, 3]),
                     units = codata.mat[, 4],
                     stringsAsFactors = FALSE)
```
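To see what the two regex steps do without downloading anything, here is a minimal sketch that runs the same idea on two hand-typed lines. The lines only imitate the column layout of the NIST file, the names `sample.lines` and `sample.df` are made up for this illustration, and base R's `gsub` and `strsplit` stand in for the stringr calls.

```r
# Two illustrative lines imitating the layout of the NIST ASCII table
# (columns separated by runs of spaces, digits grouped in threes).
sample.lines <- c(
  "electron mass                             9.109 382 91 e-31        0.000 000 40 e-31        kg",
  "electron mass in u                        5.485 799 0946 e-4       0.000 000 0022 e-4       u"
)

# Step 1: runs of three or more spaces mark a column boundary
tabbed <- gsub(" {3,}", "\t", sample.lines)

# Step 2: split each line on the tabs and stack the pieces into a matrix
fields <- do.call(rbind, strsplit(tabbed, "\t", fixed = TRUE))

# Step 3: drop the single spaces that group the digits
fields[, 2:3] <- gsub(" ", "", fields[, 2:3], fixed = TRUE)

sample.df <- data.frame(quantity = fields[, 1],
                        value = as.numeric(fields[, 2]),
                        uncertainty = as.numeric(fields[, 3]),
                        units = fields[, 4],
                        stringsAsFactors = FALSE)
print(sample.df)
```

The order of the steps matters: the digit-group spaces can only be removed after the columns have been separated, otherwise the wide gaps between columns would be lost too.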

And that’s it. Now it’s easy to search the data in R for the constants you want and assign them to variables, without having to type the values manually.

```
> codata[str_detect(codata$quantity, 'electron mass'),]
                                  quantity        value uncertainty units
2       alpha particle-electron mass ratio 7.294300e+03     2.9e-06
63            deuteron-electron mass ratio 3.670483e+03     1.5e-06
89                           electron mass 9.109383e-31     4.0e-38    kg
90         electron mass energy equivalent 8.187105e-14     3.6e-21     J
91  electron mass energy equivalent in MeV 5.109989e-01     1.1e-08   MeV
92                      electron mass in u 5.485799e-04     2.2e-13     u
130             helion-electron mass ratio 5.495885e+03     5.0e-06
195               muon-electron mass ratio 2.067683e+02     5.2e-06
223            neutron-electron mass ratio 1.838684e+03     1.1e-06
264             proton-electron mass ratio 1.836153e+03     7.5e-07
310                tau-electron mass ratio 3.477150e+03     3.1e-01
320             triton-electron mass ratio 5.496922e+03     5.0e-06
```

```
> me.u <- codata[str_detect(codata$quantity, 'electron mass in u'),'value']
> me.u
[1] 0.0005485799
```
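A pattern search can return many rows (the 'electron mass' search above matched twelve), so for assignments an exact-name lookup may be safer. Below is a rough sketch of that idea. `get_constant` is a hypothetical helper, not part of any package, and `codata.demo` is a small stand-in table with values copied from the output above so the snippet runs on its own. It then uses the looked-up mass in E = mc² as a sanity check against the tabulated energy equivalent.

```r
# Stand-in for the parsed table; values copied from the printed output above
codata.demo <- data.frame(
  quantity = c("electron mass",
               "electron mass energy equivalent",
               "electron mass in u"),
  value = c(9.109383e-31, 8.187105e-14, 5.485799e-04),
  units = c("kg", "J", "u"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the value whose quantity name matches exactly,
# and fail loudly if there is no match or more than one
get_constant <- function(tbl, name) {
  hit <- tbl$value[tbl$quantity == name]
  if (length(hit) != 1) {
    stop("expected exactly one match for '", name, "', found ", length(hit))
  }
  hit
}

me.kg <- get_constant(codata.demo, "electron mass")
c0 <- 299792458              # speed of light in m/s, typed by hand for this sketch
rest.energy <- me.kg * c0^2  # E = m c^2, in joules

# Relative difference from the tabulated energy equivalent
tabulated <- get_constant(codata.demo, "electron mass energy equivalent")
rel.diff <- abs(rest.energy / tabulated - 1)
print(rel.diff)
```

With the full `codata` data frame, the same helper should work unchanged by passing `codata` instead of `codata.demo`.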

I like the dataframe format because it’s what I (and I think many other R users) am used to, but there are other useful solutions at StackOverflow.

I wonder how to get the core R development team to include this listing or a similar one in base R for easy loading via a command like `data(codata)`.
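In the meantime, one possible workaround relies on the documented behavior that `data()` also searches a `data` subdirectory of the current working directory. The sketch below saves a data frame there and reloads it by name. The tiny `codata` table and the use of `tempdir()` as the working directory are assumptions made only so the example is self-contained; in practice you would save the full parsed table in your project directory.

```r
# Tiny stand-in for the parsed table (values copied from the output above)
codata <- data.frame(quantity = c("electron mass", "electron mass in u"),
                     value = c(9.109383e-31, 5.485799e-04),
                     units = c("kg", "u"),
                     stringsAsFactors = FALSE)

# Work in a temporary directory so the sketch leaves no files behind
old.wd <- setwd(tempdir())
dir.create("data", showWarnings = FALSE)

# Save the object under the name that data() will be asked for
save(codata, file = file.path("data", "codata.rda"))
rm(codata)

# data() looks in ./data when no package is given
data(codata)
reloaded.rows <- nrow(codata)

setwd(old.wd)
print(codata)
```

This only covers one machine or project; sharing the table more widely would still mean putting it in a package.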