Comparing Java libraries for sunrise/sunset calculation

Author

Klaus Brunner

Published

2025-02-24

How accurate are some of the available Java libraries for calculating sunrise and sunset times? This is not an exhaustive analysis; the aim is to estimate the range of deviations to expect.

All sunrise/sunset calculators are based on more or less detailed models of the solar system. Better models usually result in better accuracy, especially for locations near the poles, where the sun’s path meets the horizon at a shallow angle. However, it’s impossible to predict truly exact times even with the best models: local topography and varying atmospheric refraction have a significant effect on when sunrise or sunset is observed. As Jean Meeus, author of Astronomical Algorithms and a frequently cited authority on the topic, puts it: “giving rising or setting times ... more accurately than to the nearest minute makes no sense.”
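To get a feeling for why high latitudes are so sensitive, here is a minimal sketch based on the textbook sunrise equation for the hour angle. This is not what any of the tested libraries implement; the class and method names are made up for illustration, and the declination is simply passed in rather than computed from a date.

```java
import java.util.OptionalDouble;

class SunriseEquationSketch {
    // Conventional altitude of the sun's centre at rise/set: about 0.567 degrees
    // of refraction at the horizon plus 0.267 degrees of solar semi-diameter.
    static final double STANDARD_ALTITUDE_DEG = -0.833;

    // Hour angle (degrees) at which the sun reaches the given altitude.
    // Empty if it never does on that day (polar day or polar night).
    static OptionalDouble hourAngleDeg(double latitudeDeg, double declinationDeg, double altitudeDeg) {
        double lat = Math.toRadians(latitudeDeg);
        double decl = Math.toRadians(declinationDeg);
        double cosH = (Math.sin(Math.toRadians(altitudeDeg)) - Math.sin(lat) * Math.sin(decl))
                / (Math.cos(lat) * Math.cos(decl));
        if (cosH < -1.0 || cosH > 1.0) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(Math.toDegrees(Math.acos(cosH)));
    }

    // How many minutes the rise/set time moves if the altitude threshold is
    // lowered by deltaDeg, e.g. because refraction is stronger than assumed.
    static double shiftMinutes(double latitudeDeg, double declinationDeg, double deltaDeg) {
        double base = hourAngleDeg(latitudeDeg, declinationDeg, STANDARD_ALTITUDE_DEG).getAsDouble();
        double shifted = hourAngleDeg(latitudeDeg, declinationDeg, STANDARD_ALTITUDE_DEG - deltaDeg).getAsDouble();
        return (shifted - base) * 4.0; // Earth rotates one degree in about four minutes
    }

    public static void main(String[] args) {
        // Same 0.1 degree change in the threshold, at the equinox (declination 0)
        System.out.println("Equator:     " + shiftMinutes(0.0, 0.0, 0.1) + " min");
        System.out.println("78 deg north: " + shiftMinutes(78.0, 0.0, 0.1) + " min");
        // Midsummer at 78 degrees north: no sunset at all
        System.out.println(hourAngleDeg(78.0, 23.44, STANDARD_ALTITUDE_DEG).isPresent());
    }
}
```

With these simplifications, a 0.1° change in the assumed refraction moves the event by roughly 0.4 minutes at the equator, but by nearly 2 minutes at 78° latitude.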

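The contenders in this test are solarpositioning (labelled SPA in the results, after the algorithm it implements), sunrisesunsetlib (SRSL), and commons-suncalc (commons).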

As a reference, the following comparison uses data for the year 2020 in various locations around the world, retrieved from the US Naval Observatory’s (USNO) website. The data has been reduced to three days in each month (the 5th, 15th, and 25th), resulting in a sample of 36 days for each location. All times are in UTC and rounded to the nearest minute; elevation is not taken into account. The Java code can be found at https://github.com/klausbrunner/sunrise-comparison.
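The sampling scheme is simple enough to sketch in a few lines. This is an illustration only, not the code from the repository: the class and method names are hypothetical, and rounding half a minute upwards is an assumption about how "nearest minute" is applied.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

class SampleDays {
    // The 5th, 15th, and 25th of every month: 36 dates per year.
    static List<LocalDate> sampleDates(int year) {
        List<LocalDate> dates = new ArrayList<>();
        for (int month = 1; month <= 12; month++) {
            for (int day : new int[] {5, 15, 25}) {
                dates.add(LocalDate.of(year, month, day));
            }
        }
        return dates;
    }

    // Rounds a UTC timestamp to the nearest minute (30 seconds round up).
    static LocalDateTime roundToMinute(LocalDateTime utc) {
        return utc.plusSeconds(30).truncatedTo(ChronoUnit.MINUTES);
    }

    public static void main(String[] args) {
        System.out.println(sampleDates(2020).size());
        System.out.println(roundToMinute(LocalDateTime.of(2020, 4, 15, 0, 18, 31)));
    }
}
```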

Results

Let’s have a first look.

Code
library(tidyverse)
library(knitr)

data <- read_table("results.txt", col_types = "Dffftt", na = "null") |>
  pivot_longer(cols = starts_with("sun"),
               names_to = "event",
               values_to = "time")

ggplot(data = data, aes(
  x = date,
  y = time,
  color = event,
  linetype = algo
)) +
  geom_line() +
  scale_x_date(date_labels = "%b") +
  facet_wrap(~ location)

The differences between the libraries and USNO’s data can’t be too dramatic: the individual lines are hard to tell apart except in the two locations closest to the poles, Longyearbyen, Norway (about 78° northern latitude) and McMurdo Station in Antarctica (77.85° southern latitude).

Looking at the differences and their distribution more closely:

Code
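# Reference values: one USNO time per date, location, and event
ref_data <- data |>
  filter(algo == "USNO") |>
  select(date, location, event, time) |>
  rename(reference_time = time)

# Attach the matching reference time to every library result
comparison_data <- data |>
  filter(algo != "USNO") |>
  left_join(ref_data, by = c("date", "location", "event"))

# Drop rows where library and reference agree that the event does not occur.
# If only one side is missing, keep the row with an NA difference so that it
# is counted as a disagreement below.
comparison_data <- comparison_data |>
  filter(!(is.na(time) & is.na(reference_time))) |>
  mutate(time_diff = if_else(is.na(time) |
                               is.na(reference_time), NA, abs(
                                 difftime(time, reference_time, units = "secs")
                               )))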

summary_results <- comparison_data |>
  group_by(algo) |>
  summarise(
    avg_time_diff = round(mean(time_diff, na.rm = TRUE), 1),
    max_time_diff = round(max(time_diff, na.rm = TRUE), 1),
    disagreements = sum(is.na(time_diff))
  )

kable(summary_results)

| algo | avg_time_diff | max_time_diff | disagreements |
|---|---|---|---|
| SPA | 3.2 secs | 120 secs | 0 |
| SRSL | 61.1 secs | 960 secs | 0 |
| commons | 71.2 secs | 1140 secs | 4 |

Whenever a library and USNO disagree about whether a sunrise or sunset occurs on a given day at all, it’s counted as a disagreement.
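For readers who would rather see this in Java than in R, the per-sample metric could look roughly like the following. It is a sketch with hypothetical names, not part of the comparison code; an empty `Optional` stands for "no such event on this day".

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.Optional;
import java.util.OptionalLong;

class DeviationSketch {
    // A disagreement: exactly one of the two sources reports the event.
    static boolean isDisagreement(Optional<LocalTime> library, Optional<LocalTime> reference) {
        return library.isPresent() != reference.isPresent();
    }

    // Absolute deviation in seconds, or empty if either side has no event.
    // Note: plain LocalTime arithmetic ignores wrap-around at midnight.
    static OptionalLong absDiffSeconds(Optional<LocalTime> library, Optional<LocalTime> reference) {
        if (!library.isPresent() || !reference.isPresent()) {
            return OptionalLong.empty();
        }
        Duration diff = Duration.between(reference.get(), library.get()).abs();
        return OptionalLong.of(diff.getSeconds());
    }

    public static void main(String[] args) {
        // Library says 00:19, reference says 00:38
        System.out.println(absDiffSeconds(
                Optional.of(LocalTime.of(0, 19)), Optional.of(LocalTime.of(0, 38))));
        // Library reports a sunrise, reference reports none
        System.out.println(isDisagreement(
                Optional.of(LocalTime.of(10, 30)), Optional.empty()));
    }
}
```

Samples where both sides report no event are neither a deviation nor a disagreement, which mirrors the filter in the R code above.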

Code
comparison_data |>
  ggplot(aes(x = time_diff)) +
  geom_histogram(binwidth = 60) +
  facet_grid(vars(algo))

Where do the disagreements and outliers come from?

Code
kable(comparison_data |> filter(is.na(time_diff)))

| date | location | algo | type | event | time | reference_time | time_diff |
|---|---|---|---|---|---|---|---|
| 2020-02-15 | Longyearbyen | commons | NORMAL | sunrise | 10:30:00 | NA | NA secs |
| 2020-02-15 | Longyearbyen | commons | NORMAL | sunset | 11:56:00 | NA | NA secs |
| 2020-10-25 | Longyearbyen | commons | ALL_NIGHT | sunrise | NA | 09:50:00 | NA secs |
| 2020-10-25 | Longyearbyen | commons | ALL_NIGHT | sunset | NA | 11:31:00 | NA secs |

Code
kable(comparison_data |> filter(time_diff > 600) |> arrange(desc(time_diff)))

| date | location | algo | type | event | time | reference_time | time_diff |
|---|---|---|---|---|---|---|---|
| 2020-04-15 | Longyearbyen | commons | NORMAL | sunrise | 00:19:00 | 00:38:00 | 1140 secs |
| 2020-08-26 | Longyearbyen | commons | NORMAL | sunrise | 00:35:00 | 00:16:00 | 1140 secs |
| 2020-10-25 | Longyearbyen | SRSL | NORMAL | sunset | 11:15:00 | 11:31:00 | 960 secs |
| 2020-02-25 | McMurdo | commons | NORMAL | sunset | 10:36:00 | 10:48:00 | 720 secs |
| 2020-08-25 | McMurdo | commons | NORMAL | sunset | 03:34:00 | 03:22:00 | 720 secs |
| 2020-10-15 | McMurdo | commons | NORMAL | sunset | 10:18:00 | 10:06:00 | 720 secs |
| 2020-02-25 | Longyearbyen | commons | NORMAL | sunrise | 08:00:00 | 08:11:00 | 660 secs |

As expected, the disagreements and outliers all come from the locations in the far North and South. Somewhat less expected: the dated sunrisesunsetlib performs a bit better than commons-suncalc in this sample. SPA (solarpositioning) never deviates by more than 2 minutes from the reference, if at all.

Code
comparison_data |>
  ggplot(aes(x = time_diff, fill = algo)) +
  geom_histogram(binwidth = 60) +
  facet_grid(vars(location))

Days of silliness: transitioning in and out of solar days and nights

The transitions into and out of solar days and nights (periods in which the sun stays above or below the horizon around the clock) are a challenge for all algorithms. Even respected sources like USNO or NOAA may give widely varying or contradictory results for a day or two around these transitions, such as multiple sunsets in a row with no sunrise in between. This isn’t terribly surprising: the sun skims the horizon around local midnight, so small differences in the model or in the assumed refraction decide whether a crossing is counted at all. Let’s look at the situation in Longyearbyen in mid-April 2025 as an example, using zenith angle data from SPA, both with and without refraction correction to make things more interesting (getting these as CSV tables is really easy with solarpos).

Just how many sunsets are there? How many would a human observer recognise as such?

Code
pos <- read_csv("longyearbyen-position.csv") |> mutate(refraction = "uncorrected")
posCorrected <- read_csv("longyearbyen-position-refraction-corr.csv") |> mutate(refraction = "corrected")
pos <- bind_rows(pos, posCorrected)

ggplot(data = pos, aes(x = dateTime, y = zenith, colour = refraction)) +
  geom_line() +
  scale_y_reverse() +
  geom_hline(yintercept = 90,
             linetype = "dotted",
             color = "blue")
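
One way to make the question concrete is to count how often a zenith-angle series crosses the 90° line. The sketch below does that for a made-up series (not the Longyearbyen data) and for the same series shifted by 0.5°, an illustrative stand-in for refraction correction; the class and method names are hypothetical.

```java
class HorizonCrossings {
    static final double HORIZON_ZENITH_DEG = 90.0;

    // Counts transitions from above the horizon (zenith < 90) to at or below it.
    static int countSunsets(double[] zenithDeg) {
        int count = 0;
        for (int i = 1; i < zenithDeg.length; i++) {
            if (zenithDeg[i - 1] < HORIZON_ZENITH_DEG && zenithDeg[i] >= HORIZON_ZENITH_DEG) {
                count++;
            }
        }
        return count;
    }

    // Counts transitions from at or below the horizon to above it.
    static int countSunrises(double[] zenithDeg) {
        int count = 0;
        for (int i = 1; i < zenithDeg.length; i++) {
            if (zenithDeg[i - 1] >= HORIZON_ZENITH_DEG && zenithDeg[i] < HORIZON_ZENITH_DEG) {
                count++;
            }
        }
        return count;
    }

    // Returns a copy of the series with every value shifted by deltaDeg.
    static double[] shift(double[] zenithDeg, double deltaDeg) {
        double[] shifted = new double[zenithDeg.length];
        for (int i = 0; i < zenithDeg.length; i++) {
            shifted[i] = zenithDeg[i] + deltaDeg;
        }
        return shifted;
    }

    public static void main(String[] args) {
        // Made-up samples of a sun grazing the horizon
        double[] uncorrected = {89.6, 89.9, 90.2, 89.95, 89.7, 90.1, 90.3, 89.8};
        // Assumption: refraction lifts the apparent sun by about 0.5 degrees
        double[] corrected = shift(uncorrected, -0.5);
        System.out.println("uncorrected sunsets: " + countSunsets(uncorrected)); // 2
        System.out.println("corrected sunsets:   " + countSunsets(corrected));   // 0
    }
}
```

In this toy series, the uncorrected data shows two sunsets within a few samples, while the shifted data shows none: the sun never appears to set.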

The “windowed mode” of commons-suncalc is a clever way to deal with this: you can set the search radius for the previous or next event as needed.

Summary

All the libraries perform well enough for many purposes as long as the locations are safely between the polar circles: you can expect deviations of no more than a few minutes. For best accuracy on your next Java-powered polar expedition, I’d suggest packing solarpositioning (or the SPA algorithm in general).