my_val <- c(0.1, 0.00000000123, 13/150660001, 761231, -3.243)
my_val
[1]  1.00000e-01  1.23000e-09  8.62870e-08  7.61231e+05
[5] -3.24300e+00
Notation and tables
MADS6
Tuesday, 10 December 2024
Exact numbers
Note
In text, we would write “three bikes and seven people”.
1 km = 1000 m
299 792 458 m/s (a definition)
Measured numbers
Multiple measurements will return differing results.
Measurements by different people will return different results.
Precision is naturally limited.
Note
In data science, we frequently face values that come with little detail about how they were measured. That is not an invitation to ignore their limited precision.
Why we care
In data science we typically do not measure data ourselves.
The value we print in a report, however, conveys a sense of certainty.
0.000000230453400
Leading zeros: 0.000000 (everything before the first non-zero digit)
Captive zero: the 0 between 23 and 4534
Trailing zeros: the final 00
| Value | Digits | Explanation |
|---|---|---|
| 1642 m | 4 | All non-zero |
| 10.303 ml | 5 | Captive zeros |
| 67.0 g | 3 | Trailing measured zero |
| 0.00053503 | 5 | All leading zeros irrelevant |

| Value | Digits | Explanation |
|---|---|---|
| 1200001 m | 7 | Mostly captive zeros |
| 12.000 g | 5 | If measured |
| 0.0000000001 | 1 | All leading zeros irrelevant |
| 0.0010000001 | 8 | Leading zeros irrelevant, captive zeros count |
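The counting rules in the tables above can be applied in R with `signif()`, which rounds to a given number of significant digits. The following is a minimal sketch; the choice of three digits is only an illustration, and the values are taken from the tables.

```r
# Values from the tables above (units dropped for the calculation)
x <- c(1642, 10.303, 0.00053503)

# Round every value to three significant digits.
# Leading zeros do not count, so 0.00053503 becomes 0.000535.
sig3 <- signif(x, 3)

# A numeric value cannot remember a measured trailing zero:
# 67.0 and 67 are the same number. To report "67.0 g" we have
# to format the value as text with one decimal place.
measured <- sprintf("%.1f", 67)
```

Here `sig3` holds 1640, 10.3 and 0.000535, and `measured` is the string `"67.0"`. The trailing zero survives only because the value was converted to text.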
Many numbers in data science fall outside our “nice” range from 0 to 1000.
Proper scientific notation is \(1.9 \times 10^{-4}\), but software usually prints 1.9e-4.
$1.9 \times 10^{-4}$
[1] "1.000000e-01" "1.230000e-09" "8.628700e-08"
[4] "7.612310e+05" "-3.243000e+00"
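As a rough sketch of how the two spellings relate, the snippet below prints a value in R's e-notation and then builds the LaTeX form for a report. The helper `to_latex_sci()` is a hypothetical name introduced here for illustration, not a function from base R or any package, and it assumes non-zero input.

```r
x <- 1.9e-4

# R's built-in e-notation
e_short <- format(x, scientific = TRUE)   # "1.9e-04"
e_fixed <- sprintf("%.2e", x)             # "1.90e-04"

# Hypothetical helper: turn a non-zero number into LaTeX scientific notation
to_latex_sci <- function(x, digits = 3) {
  exponent <- floor(log10(abs(x)))               # power of ten
  mantissa <- signif(x / 10^exponent, digits)    # value between 1 and 10
  sprintf("$%g \\times 10^{%d}$", mantissa, exponent)
}

latex <- to_latex_sci(x)
```

The resulting string is `$1.9 \times 10^{-4}$`, which can be pasted into a document that renders LaTeX math.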
Engineering notation: exponents are a multiple of three.
Using pillar::num()
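To make the "multiple of three" rule concrete, here is a minimal base R sketch. The helper `to_eng()` is a hypothetical function written for this illustration and assumes non-zero input; in practice `pillar::num()` with `notation = "eng"` does the job, as shown in the comment.

```r
my_val <- c(0.1, 0.00000000123, 13/150660001, 761231, -3.243)

# Hypothetical helper: engineering notation for non-zero numbers
to_eng <- function(x, digits = 3) {
  # Largest multiple of three that does not exceed the power of ten
  exp3 <- 3 * floor(log10(abs(x)) / 3)
  mantissa <- x / 10^exp3
  # Note: a mantissa that rounds up to 1000 is not handled in this sketch
  sprintf("%.*ge%d", as.integer(digits), mantissa, exp3)
}

eng <- to_eng(my_val)

# With the pillar package installed, the equivalent call would be:
# pillar::num(my_val, notation = "eng")
```

For example, 761231 becomes `"761e3"` and 0.00000000123 becomes `"1.23e-9"`, so every exponent lines up with an SI prefix (kilo, nano, and so on).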
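sprintf()
%i - Integer values
%f - Decimal numerical format (fixed)
%e - Scientific notation, %E with capital E
%g - Best of both worlds: scientific notation if the exponent is < -4 or at least the precision, decimal otherwise
Many other options for padding and signs; literal text such as currency symbols can be part of the format string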
[1] "0.100000" "0.000000" "0.000000"
[4] "761231.000000" "-3.243000"
[1] "1.000000e-01" "1.230000e-09" "8.628700e-08"
[4] "7.612310e+05" "-3.243000e+00"
[1] "0.1" "1.23E-09" "8.6287E-08" "761231"
[5] "-3.243"
Error in sprintf("%i", my_val): invalid format '%i'; use format %f, %e, %g or %a for numeric objects
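The outputs above appear to come from calls of the following kind. This sketch repeats them on `my_val` and adds an explicit integer conversion, since `%i` refuses non-integer doubles. The precision and padding examples at the end are additions for illustration.

```r
my_val <- c(0.1, 0.00000000123, 13/150660001, 761231, -3.243)

fixed <- sprintf("%f", my_val)   # fixed decimals, six by default
sci   <- sprintf("%e", my_val)   # scientific notation
best  <- sprintf("%G", my_val)   # shortest of both, capital E

# %i only accepts whole numbers, so round and convert first
ints <- sprintf("%i", as.integer(round(my_val)))

# Precision and padding
two_dec <- sprintf("%.2f", my_val[4])    # "761231.00"
padded  <- sprintf("%10.3f", my_val[5])  # right-aligned in 10 characters
```

Rounding before `%i` throws away information (the three small values all become `"0"`), so it only makes sense for values that are counts to begin with.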
Python
Modern Python uses a similar format specification in str.format() and f-strings. The older printf-style % operator still works but is discouraged.
Check the documentation.
Compare different numbers: eight, 12.0, 1.76e-7, four, 3.0
When adding numbers, the result must be rounded to the last digit of the least precise value.
[1] 11.961
[1] 12
The least precise value (4.5) carries a single decimal digit.
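The addition rule can be sketched as follows. The slide does not show the addends, so the three values below are hypothetical, chosen only because they include 4.5 and add up to 11.961. The number of measured decimals per value is likewise an assumption.

```r
# Hypothetical measurements (not from the slide) and their decimal places
addends  <- c(4.5, 3.21, 4.251)
decimals <- c(1, 2, 3)

total <- sum(addends)                       # 11.961

# Round to the decimals of the least precise addend
rounded <- round(total, min(decimals))      # 12

# round() drops the trailing zero, so format as text for reporting
reported <- formatC(total, format = "f", digits = min(decimals))
```

`reported` is the string `"12.0"`, which keeps the one decimal digit that the least precise measurement justifies.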
Base R prints this nicely, but the result is not publication quality.
# A tibble: 344 × 4
species island bill_length_mm bill_depth_mm
<fct> <fct> <dbl> <dbl>
1 Adelie Dream 32.1 15.5
2 Adelie Dream 33.1 16.1
3 Adelie Torgersen 33.5 19
4 Adelie Dream 34 17.1
5 Adelie Torgersen 34.1 18.1
6 Adelie Torgersen 34.4 18.4
7 Adelie Biscoe 34.5 18.1
8 Adelie Torgersen 34.6 21.1
9 Adelie Torgersen 34.6 17.2
10 Adelie Biscoe 35 17.9
# ℹ 334 more rows
gt
| species | island | bill_length_mm | bill_depth_mm |
|---|---|---|---|
| Gentoo | Biscoe | 44.5 | 14.3 |
| Adelie | Torgersen | 38.6 | 21.2 |
| Gentoo | Biscoe | 45.3 | 13.7 |
| Chinstrap | Dream | 52.8 | 20.0 |
| Adelie | Torgersen | 37.3 | 20.5 |
| Chinstrap | Dream | 43.2 | 16.6 |
| Gentoo | Biscoe | 47.5 | 14.2 |
| Gentoo | Biscoe | 52.2 | 17.1 |
| Chinstrap | Dream | 50.8 | 19.0 |
| Gentoo | Biscoe | 46.1 | 13.2 |
gt
| Species | Island | Bill length (mm) | Bill depth (mm) |
|---|---|---|---|
| Adelie | Dream | 35.6 | 17.5 |
| Gentoo | Biscoe | 55.9 | 17.0 |
| Gentoo | Biscoe | 43.2 | 14.5 |
| Adelie | Torgersen | 37.2 | 19.4 |
| Adelie | Torgersen | 34.6 | 17.2 |
| Adelie | Dream | 36.0 | 18.5 |
| Chinstrap | Dream | 52.2 | 18.8 |
| Adelie | Dream | 32.1 | 15.5 |
| Adelie | Dream | 37.2 | 18.1 |
| Adelie | Biscoe | 38.1 | 17.0 |
gt() centers factors by default.
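gt
library(dplyr)
library(gt)
library(palmerpenguins)  # assumed source of the penguins data

penguins |>
  select(species, island, contains("bill")) |>
  sample_n(10) |>
  gt() |>
  # Left-align the two factor columns
  cols_align(
    align = "left",
    columns = c(species, island)) |>
  cols_label(
    species = "Species",
    island = "Island",
    bill_length_mm = "Length",
    bill_depth_mm = "Depth"
  ) |>
  # Group both bill columns under one spanner label
  tab_spanner(label = "Bill dimensions (mm)", columns = contains("bill"))

| Species | Island | Bill dimensions (mm): Length | Bill dimensions (mm): Depth |
|---|---|---|---|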
| Chinstrap | Dream | 52.0 | 18.1 |
| Adelie | Torgersen | 35.9 | 16.6 |
| Adelie | Torgersen | 39.7 | 18.4 |
| Adelie | Torgersen | 42.8 | 18.5 |
| Adelie | Biscoe | 37.6 | 17.0 |
| Adelie | Dream | 40.2 | 17.1 |
| Gentoo | Biscoe | 45.5 | 15.0 |
| Gentoo | Biscoe | 46.1 | 13.2 |
| Chinstrap | Dream | 58.0 | 17.8 |
| Chinstrap | Dream | 46.5 | 17.9 |
gt
penguins |>
  select(species, island, contains("bill")) |>
  sample_n(10) |>
  gt() |>
  cols_align(
    align = "left",
    columns = c(species, island)) |>
  cols_label(
    species = "Species",
    island = "Island",
    bill_length_mm = "Length",
    bill_depth_mm = "Depth"
  ) |>
  tab_spanner(label = "Bill dimensions (mm)", columns = contains("bill")) |>
  tab_options(column_labels.background.color = "#00A4E1")

| Species | Island | Bill dimensions (mm): Length | Bill dimensions (mm): Depth |
|---|---|---|---|
| Chinstrap | Dream | 49.6 | 18.2 |
| Adelie | Dream | 37.6 | 19.3 |
| Gentoo | Biscoe | 45.4 | 14.6 |
| Gentoo | Biscoe | 51.3 | 14.2 |
| Gentoo | Biscoe | 43.8 | 13.9 |
| Chinstrap | Dream | 52.8 | 20.0 |
| Adelie | Dream | 40.9 | 18.9 |
| Adelie | Torgersen | NA | NA |
| Adelie | Torgersen | 41.1 | 17.6 |
| Adelie | Torgersen | 35.9 | 16.6 |
gt
penguins |>
  select(species, island, contains("bill")) |>
  sample_n(10) |>
  gt() |>
  cols_align(align = "left", columns = c(species, island)) |>
  cols_label(
    species = "Species",
    island = "Island",
    bill_length_mm = "Length",
    bill_depth_mm = "Depth"
  ) |>
  tab_spanner(label = "Bill dimensions (mm)", columns = contains("bill")) |>
  tab_options(column_labels.background.color = "#00A4E1") |>
  tab_style(
    style = cell_text(size = pct(120)),
    locations = cells_body()) |>
  tab_style(
    style = cell_text(weight = "bold"),
    locations = list(cells_column_labels(),
                     cells_column_spanners()))

| Species | Island | Bill dimensions (mm): Length | Bill dimensions (mm): Depth |
|---|---|---|---|
| Chinstrap | Dream | 51.3 | 19.9 |
| Gentoo | Biscoe | 45.4 | 14.6 |
| Adelie | Torgersen | 36.2 | 17.2 |
| Adelie | Biscoe | 43.2 | 19.0 |
| Gentoo | Biscoe | 46.5 | 13.5 |
| Adelie | Biscoe | 41.1 | 18.2 |
| Gentoo | Biscoe | 45.5 | 13.9 |
| Adelie | Biscoe | 35.5 | 16.2 |
| Adelie | Dream | 36.0 | 17.1 |
| Chinstrap | Dream | 50.2 | 18.8 |