eeptools 1.3.0 is out on CRAN today, and it ships no new features.
That is deliberate: I'm putting eeptools officially into
long-term support. For quite a while the package has been left largely unattended,
updated only minimally to keep it on CRAN. My intention moving forward is to shrink the package
to what it uniquely contributes and to keep it healthy and compatible. Since it was created, base R,
readr, dplyr, and janitor have come to do many of its jobs better than it ever did. What's left is the handful of tools that are still genuinely unique and useful for
education data, which I will continue to maintain!
install.packages("eeptools")

A decade of accumulation
eeptools — Education Evaluation and Policy tools — has been on CRAN since
September 2012, the renamed successor to a package I started in grad school
called LDS_TOOLS. In the years since it has been downloaded more than
264,000 times, and still pulls roughly 2,900 a month, which is humbling
for a package whose core audience is analysts at state and local education
agencies.
Like a lot of long-lived personal packages, it accreted. Every time I needed a
small helper that didn’t exist yet — strip a comma out of a number, take the
mode of a vector, pad a leading zero, throw a quick lm diagnostic plot — it
went into eeptools. That made sense in 2013, when the tidyverse was nascent and
readr::parse_number() didn’t exist. But now, even as the package author,
I never use it for any of those jobs; there are simply better ways to do
the job in 2026.
The job of a maintained package is not to do everything; it is to do its one thing well, for a long time, without surprising you.
eeptools has effectively been in maintenance mode for years
already — I just never announced it. The last release to add a genuinely new
feature was 1.2.0 in May 2018, which introduced isid(). Every release in the
eight years since has been a bug fix or a compatibility patch: quieting unit tests
under a new random-number generator, dropping mapping functions when maptools
was archived, and chasing roxygen and CRAN-check changes to keep the package
installable. And to be honest, I haven’t reached for most of these functions in years.
So I’m making the stance official rather than leaving it implied: eeptools is feature-complete and in long-term support. The time has come to clean it up, streamline it for long-term maintenance, and remove the features I’m almost certain no one is using. 1.3.0 is the first step; what that commitment means in practice, and where it ends, is covered further down.
What eeptools is for
The part that is staying is the handful of functions that
are genuinely specific to education unit-record data and still annoying to do by
hand. There are three *_calc functions at the center of it.
age_calc() computes age between two dates, correctly accounting for leap years
and leap seconds — the kind of thing you need when you compute a student’s age
on a cut date from a date-of-birth field:
age_calc(dob = as.Date("1995-01-15"), enddate = as.Date("2003-02-16"), units = "years")
#> [1] 8.087671
age_calc(dob = as.Date("1995-01-15"), enddate = as.Date("2003-02-16"), units = "months")
#> [1] 97.03571

retained_calc() takes student IDs and grade levels and returns, for the grade
you care about, who was held back:
x <- data.frame(sid = c(101, 101, 102, 103, 103, 103, 104, 105, 105, 106, 106),
                grade = c(9, 10, 9, 9, 9, 10, 10, 8, 9, 7, 7))
retained_calc(df = x, sid = "sid", grade = "grade", grade_val = 9)
#>   sid retained
#> 1 101        N
#> 2 102        N
#> 3 103        Y
#> 4 105        N

And moves_calc() reads enrollment and exit dates to flag students who changed
schools within a year — the within-year mobility that doesn’t fall out of a
simple enrollment count.
Alongside the calc functions are the cleaning and checking helpers.
Administrative files arrive with redaction markers,
truncated identifiers, and no guarantee that the key you think is unique actually
is. isid() checks whether a set of variables forms a unique key before you
aggregate on it:
df <- data.frame(sid = c(1, 1, 2, 2),
                 year = c(2023, 2024, 2023, 2024),
                 gpa = c(3.1, 3.4, 2.8, 3.0))
isid(df, vars = c("sid"))
#> [1] FALSE
isid(df, vars = c("sid", "year"))
#> [1] TRUE

remove_char() strips a marker such as the * used for redacted cells, leaving
NA so the column can become numeric; leading_zero() pads fixed-width codes
(school numbers, FIPS codes) that lost their zeroes when read as numbers:
remove_char(c(1, 5, 3, 6, "*", 2, 5, "*", "*"), "*")
#> [1] "1" "5" "3" "6" NA "2" "5" NA NA
leading_zero(c(1, 23, 7, 105), digits = 4)
#> [1] "0001" "0023" "0007" "0105"

statamode() gives you the mode of a vector (numeric, character, or factor),
cutoff() and thresh() describe how concentrated a quantity is, and three
example datasets (stuatt, stulevel, and midsch) ship with the package
for teaching and for the R Bootcamp for Education Analysts. None of that is
going anywhere.
Keep the education-specific tools; let base R and the tidyverse have the rest.
What’s leaving, and what to use instead
Eleven functions are now deprecated. They still work — they just emit a warning naming their replacement — and they will be removed in 2.0.0:
defac(factor(c("A", "B", "A")))
#> Warning message:
#> defac() is deprecated as of eeptools 1.3.0 and will be removed in a
#> future release. Use as.character() instead.
#> [1] "A" "B" "A"

The full list, and where each one goes:
| Deprecated | Use instead |
|---|---|
| defac() | as.character() |
| makenum() | as.numeric(as.character()) |
| decomma() | readr::parse_number() |
| lag_data() | dplyr::lag() with group_by(), or data.table::shift() |
| crosstabs() | janitor::tabyl() or dplyr::count() |
| crosstabplot() | vcd::mosaic() or the ggmosaic package |
| gelmansim() | the marginaleffects package, or merTools::predictInterval() |
| autoplot.lm() | ggfortify::autoplot() or performance::check_model() |
| cleanTex() | modern knitr/Quarto manage their own build artifacts |
| profpoly(), profpoly.data() | niche assessment-band helpers; no direct replacement |
Every one of these was a reasonable thing to bundle a decade ago. Today each is either a one-liner in base R or a better-maintained function in a well-known package. Pointing you at those, rather than at my version, is the whole point of the release.
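To make the first rows of the table concrete, here is a minimal base-R sketch of three of the swaps. The sample values are made up, and the gsub() line is only a dependency-free approximation of what the table actually recommends for decomma(), which is readr::parse_number().

```r
# Base-R stand-ins for three deprecated helpers; the sample values are invented.
f <- factor(c("10", "20", "10"))

# defac(): factor -> character
chr <- as.character(f)

# makenum(): factor -> numeric. Go through character so the labels are
# converted; as.numeric(f) alone would return the internal codes 1, 2, 1.
num <- as.numeric(as.character(f))

# decomma(): strip thousands separators, then convert. This handles plain
# comma-formatted numbers only; readr::parse_number() copes with far more.
amounts <- c("1,200", "35", "4,000,000")
clean <- as.numeric(gsub(",", "", amounts, fixed = TRUE))

print(chr)
print(num)
print(clean)
```

The makenum() row is the one most worth a second look when migrating, since dropping the inner as.character() silently returns factor codes instead of the values.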
Defunct: the theme_dpi family
The old ggplot2 themes — theme_dpi(), theme_dpi_map(), theme_dpi_map2(),
and theme_dpi_mapPNG() — have been deprecated since the 1.2 series and are now
defunct. They error instead of warning:
theme_dpi()
#> Error: theme_dpi() is defunct. Use ggplot2::theme_bw() instead.

These were a house style from a long time ago, and
ggplot2::theme_bw() plus your own tweaks is a far better foundation than a
frozen theme baked into this package.
A ggplot2 4.0 fix on the way out
One function getting deprecated still got fixed first. autoplot.lm() — the quick
model-diagnostic plot — relied on ggplot2::fortify(), which 4.0 deprecated. So
even as it heads for the door, it was modernized to compute its diagnostics
directly from stats (fitted(), resid(), rstandard(), hatvalues(),
cooks.distance()) and to use the linewidth aesthetic in place of the
deprecated size on lines. It works cleanly under ggplot2 4.0 today; it will
still be removed in 2.0.0 in favor of performance::check_model(). Keeping a
deprecated function from throwing warnings of its own felt like the courteous
thing to do for the people who still call it.
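For anyone who wants the same diagnostics without going through autoplot.lm() at all, the stats functions named above can be assembled by hand. This is a rough sketch, not the package's actual implementation: the model on mtcars is arbitrary, and the column names are my own choice for illustration.

```r
# Sketch only: collect lm diagnostics with stats functions, no fortify().
# mtcars ships with R (datasets package); the model formula is arbitrary.
fit <- lm(mpg ~ wt + hp, data = mtcars)

diag_df <- data.frame(
  fitted    = fitted(fit),         # fitted values
  resid     = resid(fit),          # raw residuals
  std_resid = rstandard(fit),      # standardized residuals
  hat       = hatvalues(fit),      # leverage
  cooks_d   = cooks.distance(fit)  # Cook's distance
)

head(diag_df)
```

A data frame like this can be handed straight to ggplot2 for residual-versus-fitted or leverage plots, with linewidth rather than size on any line layers.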
What long-term support means here
I will keep it working: CRAN compatibility, fixes for breaking
changes in R and ggplot2, dependency upkeep, and clear deprecation warnings so
nothing falls out from under you on a routine install.packages(). I won’t
add new features or expand the surface area. The API is frozen except to shrink
it.
And it will keep shrinking. This release is the first pass of a deliberate,
multi-release slimming, not a one-time cleanup. Deprecation will continue through
the 1.3.x line, and 2.0.0 will remove every deprecated and unsupported function
at once — taking the arm and vcd dependencies with it — and leave only the
education-specific tools. The goal is a small, durable package: the calculators,
the cleaning helpers, the key checks, and the teaching datasets, and not much else.
If you depend on any of the functions above in production code, now is the time to migrate — the warnings name exactly what to switch to, and the replacements are all better maintained than the originals. You have the whole 1.3.x series to do it; nothing disappears before 2.0.
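One way to find those calls in a larger codebase is to collect warnings while a script or test runs and filter for the word "deprecated", which is what the 1.3.0 messages contain. The sketch below assumes nothing about eeptools internals: old_helper() is a hypothetical stand-in for any deprecated function, and collect_warnings() is an illustrative helper, not something the package provides.

```r
# old_helper() is a hypothetical stand-in for a deprecated function;
# it is not part of eeptools.
old_helper <- function(x) {
  warning("old_helper() is deprecated. Use as.character() instead.",
          call. = FALSE)
  as.character(x)
}

# Evaluate an expression, recording every warning message it raises
# without interrupting the computation.
collect_warnings <- function(expr) {
  msgs <- character(0)
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = msgs)
}

res <- collect_warnings(old_helper(factor(c("A", "B"))))

# Keep only the deprecation notices.
res$warnings[grepl("deprecated", res$warnings)]
```

Wrapping an existing analysis script this way leaves its results unchanged while giving you a list of replacements to work through.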
Get it / cite it / file issues
install.packages("eeptools")
citation("eeptools")

- Source & issues: https://github.com/jknowles/eeptools
- Release notes: https://github.com/jknowles/eeptools/blob/main/NEWS.md
Thanks to Jason Becker, who contributed to eeptools in its earliest days, and to everyone who has filed an issue or a question over the past decade. A package that lasts long enough to need pruning is a lucky one.