Skip to the content Back to Top

Java is undergoing some considerable licensing changes, prompting us to plan an all-out move from Oracle Java 8 to OpenJDK Java 11 this Spring for every Solr instance we host. I have been running covertly about the hills setting traps for Java 11.0.1 to see what I might snare before unleashing it on our live servers. I caught something this week.

Dates! Of course it's about parsing dates! I noticed that the Solr Data Import Handler (DIH) transforms didn't handle making created dates during ingest. (In DIH, we use a script transformer and manipulate some Java classes with javascript. This includes the parsing of dates from text.) Up until now, our DIH has used an older method of parsing dates with a Java class called SimpleDateFormat. If you look for info on parsing dates in Java, you will find years and years of advice related to that class and its foibles, and then you will notice that in recent times experts advise using the java.time classes introduced in Java 8. Since SimpleDateFormat didn't work during DIH, I assumed that SimpleDateFormat was deprecated in Java 11 (it isn't actually), and moved to convert the relevant DIH code to use java.time.

Many hours passed here, during which the output of two lines of code* made no goddamn sense at all. The javadocs that describe the behaviour of java.time classes are completely inadequate, with their stupid little "hello, world" examples, when dates are tricky, slippery, malicious dagger-worms of pure hatred. Long story short, a date like '2004-09-15 12:00:00 AM' produced by Inmagic ODBC from a DB/Textworks database could not be parsed. The parser choked on the string at "AM," even though my match pattern was correct: 'uuuu-MM-dd hh:mm:ss a'. Desperate to find the tiniest crack to exploit, I changed every variable I could think of one at a time. That was how I found that, when I switched to Java 8, the same exact code worked. Switch back to Java 11. Not working. Back to Java 8. Working. WTF?

I thought, maybe the Nashorn scripting engine that allows javascript to be interpreted inside the Java JVM is to blame, because this scenario does involve Java inside javascript inside Java, which is weird. So I set up a Java project with Visual Studio Code and Maven and wrote some unit tests in pure Java. (That was pretty fun. It was about the same effort as ordering a pizza in Italian when you don’t speak Italian: everything about the ordering process was tantalizingly familiar but different enough to delay my pizza for quite some time.) The problem remained: parsing worked as-written in Java 8, but not Java 11.

I started writing a Stack Overflow question. In so doing, I realized I hadn't tried an overload method of java.time.format.DateTimeFormatter.ofPattern() which takes a locale. I had already dotted many i's and crossed a thousand t's, but I wanted to really impress anyone reading the question that I had done my homework, because I hate looking ignorant, so I wrote another unit test that passed in Locale.ENGLISH and, ohmigawd, that solved the problem entirely. If you have been following along, that means that "AM/PM" could not be understood by the parser, even with the right pattern matcher, without the context of a locale, and obviously the default locale used by the simpler version of DateTimeFormatter.ofPattern() was inadequate to the task. I tested and Locale.ENGLISH and Locale.US both worked with "AM/PM" but Locale.CANADA did not. Likely the latter is my default locale, because I do reside in Canada. Really? Really, Java? We have AM and PM here in the Great White North, I assure you.

I don’t know if this a bug in Java 11. I’m merely happy to have understood the problem at this point. Just another day in the developer life, eh? Something that should be a snap becomes a grueling carnival ride that deposits you at the exit, white-faced and shaking, with an underwhelming sense of minor accomplishment. How do you explain to people that you spent 8 hours teaching a computer to treat an ordinary date as a date? Write a blog post, I guess. Winking smile

* Two lines of code. 8 hours of frustration. Here it is, ready?


import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class App {
  public LocalDateTime Parse (String dateText, String pattern) {
    DateTimeFormatter parser = DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH);
    LocalDateTime date = LocalDateTime.parse(dateText, parser);
    return date;
  }
}

Comments


Comments are closed

Let Us Help You!

We're Librarians - We Love to Help People