
Java is undergoing some considerable licensing changes, prompting us to plan an all-out move from Oracle Java 8 to OpenJDK Java 11 this Spring for every Solr instance we host. I have been running covertly about the hills setting traps for Java 11.0.1 to see what I might snare before unleashing it on our live servers. I caught something this week.

Dates! Of course it's about parsing dates! I noticed that our Solr Data Import Handler (DIH) transforms were no longer creating dates during ingest. (In DIH, we use a script transformer and manipulate some Java classes with javascript. This includes the parsing of dates from text.) Up until now, our DIH has used an older method of parsing dates with a Java class called SimpleDateFormat. If you look for info on parsing dates in Java, you will find years and years of advice related to that class and its foibles, and then you will notice that in recent times experts advise using the java.time classes introduced in Java 8. Since SimpleDateFormat didn't work during DIH, I assumed that it was deprecated in Java 11 (it isn't, actually), and moved to convert the relevant DIH code to use java.time.

Many hours passed here, during which the output of two lines of code* made no goddamn sense at all. The javadocs that describe the behaviour of java.time classes are completely inadequate, with their stupid little "hello, world" examples, when dates are tricky, slippery, malicious dagger-worms of pure hatred. Long story short, a date like '2004-09-15 12:00:00 AM' produced by Inmagic ODBC from a DB/Textworks database could not be parsed. The parser choked on the string at "AM," even though my match pattern was correct: 'uuuu-MM-dd hh:mm:ss a'. Desperate to find the tiniest crack to exploit, I changed every variable I could think of one at a time. That was how I found that, when I switched to Java 8, the same exact code worked. Switch back to Java 11. Not working. Back to Java 8. Working. WTF?

I thought, maybe the Nashorn scripting engine that allows javascript to be interpreted inside the Java JVM is to blame, because this scenario does involve Java inside javascript inside Java, which is weird. So I set up a Java project with Visual Studio Code and Maven and wrote some unit tests in pure Java. (That was pretty fun. It was about the same effort as ordering a pizza in Italian when you don’t speak Italian: everything about the ordering process was tantalizingly familiar but different enough to delay my pizza for quite some time.) The problem remained: parsing worked as-written in Java 8, but not Java 11.

I started writing a Stack Overflow question. In so doing, I realized I hadn't tried an overload method of java.time.format.DateTimeFormatter.ofPattern() which takes a locale. I had already dotted many i's and crossed a thousand t's, but I wanted to really impress anyone reading the question that I had done my homework, because I hate looking ignorant, so I wrote another unit test that passed in Locale.ENGLISH and, ohmigawd, that solved the problem entirely. If you have been following along, that means that "AM/PM" could not be understood by the parser, even with the right pattern matcher, without the context of a locale, and obviously the default locale used by the simpler version of DateTimeFormatter.ofPattern() was inadequate to the task. I tested and Locale.ENGLISH and Locale.US both worked with "AM/PM" but Locale.CANADA did not. Likely the latter is my default locale, because I do reside in Canada. Really? Really, Java? We have AM and PM here in the Great White North, I assure you.

I don’t know if this is a bug in Java 11. I’m merely happy to have understood the problem at this point. Just another day in the developer life, eh? Something that should be a snap becomes a grueling carnival ride that deposits you at the exit, white-faced and shaking, with an underwhelming sense of minor accomplishment. How do you explain to people that you spent 8 hours teaching a computer to treat an ordinary date as a date? Write a blog post, I guess.

* Two lines of code. 8 hours of frustration. Here it is, ready?


import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class App {
  // Parses text like "2004-09-15 12:00:00 AM" using a pattern like
  // "uuuu-MM-dd hh:mm:ss a". The explicit Locale argument is the fix:
  // without it, the formatter falls back to the JVM default locale, and
  // under Java 11 the "AM"/"PM" token was not recognised with my Canadian default.
  public LocalDateTime parse(String dateText, String pattern) {
    DateTimeFormatter parser = DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH);
    LocalDateTime date = LocalDateTime.parse(dateText, parser);
    return date;
  }
}
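
For the record, here is a minimal sketch of how the fix plays out, with a made-up class name and a sample value of the kind Inmagic ODBC was producing. The likely explanation (hedging here, since I haven't confirmed it): Java 9 switched the default locale data from the old JRE tables to CLDR, and CLDR's en-CA locale apparently renders the am/pm markers as "a.m."/"p.m.", so a default-locale parser on a Canadian machine rejects a plain "AM" unless you hand it an explicit locale.

import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class ParseDemo {
  public static void main(String[] args) {
    String dateText = "2004-09-15 12:00:00 AM";
    String pattern = "uuuu-MM-dd hh:mm:ss a";

    // Explicit locale: "AM" is understood and 12 AM resolves to midnight.
    DateTimeFormatter withLocale = DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH);
    System.out.println(LocalDateTime.parse(dateText, withLocale)); // 2004-09-15T00:00

    // No-locale overload: the JVM default locale is used. With an en-CA
    // default under Java 11 (as on my machine), this line throws
    // DateTimeParseException at the "AM" token.
    DateTimeFormatter withDefault = DateTimeFormatter.ofPattern(pattern);
    System.out.println(LocalDateTime.parse(dateText, withDefault));
  }
}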

Local params are a way of modifying Solr's query parser within the query itself, setting all related parameters with a shorthand syntax. They are super convenient for changing query behaviour on the fly, but as of Solr 7.5 local params are disabled by default when employing the edismax query parser.

TL;DR

Modify the request handler in solrconfig.xml to add the so-called 'magic field' _query_ back to the uf (User Fields) parameter to restore the kind of local params behaviour that was the default prior to Solr 7.5.

<str name="uf">* _query_</str>

The above allows users to do fielded searches on any field, plus allows them to use local params in their queries.
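
For context, that uf line sits among the request handler's defaults in solrconfig.xml. A rough sketch of where it lands (the handler name, qf fields and other defaults below are placeholders, not our actual configuration):

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">title topic</str>
    <str name="uf">* _query_</str>
  </lst>
</requestHandler>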

Why Local Params? An Example

Say we are developers who want to use the MoreLikeThis feature of Solr. There are multiple ways of setting this up, as described in the Solr Reference Guide. But say we are also developers who are using SolrNet to create requests to and handle responses from Solr. (As indeed is the case for us here at Andornot, in our Solr-backed Discovery Interface.)

One of SolrNet's strengths is that it maps Solr responses to strongly-typed classes. On the other hand, its weakness is that you can really only map query result documents to one strongly-typed class. (Not strictly true, but true from a practical, please-don't-make-me-do-contortions point of view.)

No response from Solr can deviate too far from that mapping. Other bits can be tacked on to the response and be handled by SolrNet (highlighting, spellcheck, facets, etc.), but these must be components that are somehow related to the context of the documents in the main response. In the case of MoreLikeThis, you have to set up the component so that each query result document generates a list of 'like' documents. Having to generate such a list for every document returned slows down the query response time and bloats the size of the response. Quite unnecessary, in my opinion. I much prefer to generate the list of 'like' documents on the fly when the user has asked for them. An easy way of doing that without messing with the SolrNet mapping setup is to use local parameters.

Say our user finds an intriguing book in their search results called "More Armadillos Stacked on a Bicycle". Perhaps, our user muses, this book is a sequel to a previous publication regarding such matters. They feel a thrill of anticipation as they click on a 'More Like This' link. (I know I would.) Here is the Solr document for that book:

{
  "id": 123,
  "title": "More Armadillos Stacked on a Bicycle",
  "topic": [
    "armadillos",
    "fruitless pursuits",
    "bicycles"
  ]
}

When using local params, the 'More Like This' query can use the same Solr request handler and all the parameters embedded within it, but swap out the query parser for the MLTQParser. The bits that are needed to complete the MoreLikeThis request are passed in via local param syntax, still within the main query parameter! (Perhaps I did not need that exclamation mark, but the armadillos-upon-a-bicycle adrenaline has not yet worn off.)

/select?q={!mlt qf=title,topic mintf=1 mindf=1}123

The local params syntax above says "find other documents like id=123 where extracted keywords from its various fields find matches in title and topic." The convenient part for the developer using SolrNet is that the response maps neatly to the kind of response we expect from a regular query: a set of result documents mapped to a strongly-typed class, which makes the response easy to handle and display using existing mechanisms.
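
To make that concrete in code: we use SolrNet in our .NET stack, so the Java/SolrJ sketch below is purely illustrative (the Solr URL and core name are invented), but it shows how the local params ride along inside the ordinary q parameter.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class MoreLikeThisDemo {
  public static void main(String[] args) throws Exception {
    // Hypothetical Solr core; substitute your own URL.
    HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/books").build();

    // The MLTQParser is selected via local params inside the normal q parameter,
    // so the /select handler and its configured defaults are reused as-is.
    SolrQuery query = new SolrQuery("{!mlt qf=title,topic mintf=1 mindf=1}123");

    QueryResponse response = solr.query(query);
    for (SolrDocument doc : response.getResults()) {
      System.out.println(doc.getFieldValue("title"));
    }
    solr.close();
  }
}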

Why Not Local Params?

I suppose we can imagine a clever and malicious user who is able to use the power of local params to hack Solr queries in order to get at information that perhaps they otherwise shouldn't. If, as a developer, you need to ensure that users are limited in their scope, then disabling local params and even further locking down the uf (User Fields) parameter to deny certain fielded searches is right and good.

For some reason, DB/TextWorks menu screens are a little-used feature. We often meet clients with many databases, but without a convenient way of seeing and accessing them all at a glance. Adding a menu screen to DB/TextWorks is quick and easy to do, and makes using your databases so much easier.

The screenshot above shows the menu screen from our Andornot Library Kit, with links to each of the many databases it includes. The one below is from one of our clients' systems.

What is a Menu Screen?

Like a Query Screen or Report Form in a DB/TextWorks database, a Menu Screen is a screen layout you create using the WYSIWYG designer in DB/TextWorks. You would usually add to it links to each of your databases, for searching or data entry. You might also add your organization's name or logo, contact or support info for anyone who might be using the system, a brief description of each database, etc.

Having links to all your databases on a single screen saves time and helps new users find their way around your collection of databases without having to hunt for them in folders on disk. It also allows you to specify, in each link to a database, which query screen and reports to load for that database. 

One approach is to create different menu screens for different kinds of users. For example, in an archives or museum that relies on volunteers to help with data entry, you could have a menu screen for volunteers that only lists the Accessions database, and pre-loads a simpler query screen and data entry form designed specifically for volunteers. A more extensive menu could provide the archivist or curator with links to all databases, pre-loading the more sophisticated query and edit screens for their use.

Unlike a Query Screen or Report Form, the menu screen isn't stored in any one database, but rather as a separate file on disk (with a .tbm or .cbm extension). You would usually store it in the same folder as all your database files.

How do I create a Menu Screen?

  1. Open DB/TextWorks but don't open a database.
  2. Select Menu Screens > Design from the main menu.
  3. Choose "Create a New Menu Screen File."
  4. Browse to the folder where your databases are stored to save the menu screen in the same location, and give it a name.
  5. In the WYSIWYG Menu Screen Designer, you may now add links to textbases, your organization's name or logo, and other information. Use the examples above for ideas, or come up with your own design.
  6. To add links to textbases, choose Edit > Add > Textbase box.
  7. In the Textbase Properties Dialogue, select the textbase to link to, then on the Initial Elements tab, pre-select the query screen and forms to use by default. Note that these override the default screens and forms set in the textbase, and that in either case, users may still change to other screens and forms once they are in the database.
  8. On the Initial Action tab, be sure to select which window to open. For example, if your link is one such as "Search the Database", select a Query Window. If your link is "Add a New Record", select Edit New Record as the window to open.
  9. Save your new menu screen when your design is complete.
  10. If you ever create more than one menu screen, you can even add links from one to another on each of them.

How do I use a Menu Screen?

  1. On each PC that has DB/TextWorks, open DB/TextWorks but don't open a database.
  2. Select Menu Screens > Select from the main menu.
  3. Choose "Use the Menu Screen in a File", then browse to and select the Menu Screen file (ending with .tbm or .cbm) that you created earlier, usually stored in the same folder as your databases.
  4. Close and re-start DB/TextWorks and your menu screen will now automatically load, ready for use.

See this blog post from earlier this week about two other helpful but little used features of DB/TextWorks: Sets and Record Skeletons.

A recent project has reminded me that many clients are not aware of the power of these three functions that have been available in DB/TextWorks for years, and which can potentially streamline and speed up your workflow.

The first is Menu Screens. Many clients have a menu screen that loads up when they open DB/TextWorks, but usually the ones we see are either the default from the old Inmagic Library Module or rudimentary boxes linking to their databases. However, they can be so much more useful! Here is an example from a recent project.

[Screenshot: CLGA menu screen]

A menu screen is super easy to set up and we’ll be posting a detailed guide here in our blog soon.

However, first we need to discuss the other two functions, as they can be used separately or in conjunction with your menu screen.

The second function is Sets. Whenever you do a search, you can choose to Save the Set from the top toolbar. Sets are a great way of providing quick access to a search with several parameters, saving you from entering them each time on the query screen. For example: find all records with a Review date in the next 30 days; or find digital image records that have been entered but not checked yet; or find all books that are not on permanent loan and that have been out for more than 60 days. You can use the @date variable in the search strategy without needing to input an actual date each time. Never used the @date function? It can be very handy, especially when combined as in @date-7:@date, which retrieves all dates within the past week. A Sets box can be added to your query screen to give you quick access to running these searches, or they can be embedded in your menu screen.

The third function is Record Skeletons. You may have a student or volunteer adding records for reports in a particular series, or images in a photographic collection, or documents in a fonds. You can create a record skeleton to prepopulate the edit screen with publication or descriptive data that is common to all these new records. You can find Skeletons under the Records menu. Note that once you select a skeleton to use, it will remain the default until you reset it to none or choose a different one.

In the menu screen example above, every database has a link to the search screen plus a link straight into a new record edit screen. If your database has several edit screens, these can be specified on the menu screen too, as can a skeleton appropriate for these new records. It may not seem like much, but this can save a couple of extra clicks and let you get straight to work. This screen also has Sets specified to prepopulate the query screen with the value for a particular collection. So easy to set up, and a great way to ensure people can search quickly and effectively.

Check out more tips and tricks for getting the most out of DB/TextWorks in our blog archive.

We are always available to help you with updates to your databases. No project is too small!

Library and Archives Canada has announced the launch of the 2019 funding cycle for the Documentary Heritage Communities Program (DHCP). This is the fifth round of what was originally envisioned as a five-year program, so this could potentially be the final year.

The DHCP provides financial assistance to the Canadian documentary heritage community for activities that:

  • increase access to, and awareness of, Canada’s local documentary heritage institutions and their holdings; and
  • increase the capacity of local documentary heritage institutions to better sustain and preserve Canada's documentary heritage.

The deadline for submitting completed application packages is January 8, 2019.

This program is a great opportunity for archives, museums, historical societies and other cultural institutions to digitize their collections, develop search engines and virtual exhibits, and undertake other activities that preserve and promote their valuable resources.

There are a number of significant changes this year:

  • The upper limit of funding for a small project has increased to $24,999. Many of the projects Andornot helps with would fall into this range.
  • Organizations which receive up to half their funding from government sources are now eligible.

Types of projects which would be considered for funding include:

  • Conversion and digitization for access purposes;
  • Conservation and preservation treatment;
  • The development (research, design and production) of virtual and physical exhibitions, including travelling exhibits;
  • Conversion and digitization for preservation purposes;
  • Increased digital preservation capacity (excluding digital infrastructure related to day-to-day activities);
  • Training and workshops that improve competencies and build capacity;
  • Development of standards, performance and other measurement activities;
  • Collection, cataloguing and access-based management; and
  • Commemorative projects.

Lists of the grants and recipients in the previous four rounds of funding are available here and may help you as you think about your own application.

Further program details, requirements and application procedures are available at http://www.bac-lac.gc.ca/eng/services/documentary-heritage-communities-program/Pages/dhcp-portal.aspx

How can Andornot help?

Many Andornot clients have obtained DHCP grants in previous rounds, and Andornot has worked on many other projects which would qualify for this grant. Some examples are detailed in previous posts on our blog.

We have extensive experience with digitizing documents, books, and audio and video materials, and with developing systems to manage those collections and make them searchable or present them in virtual exhibits.

Contact us to discuss collections you have and ideas for proposals. We'll do our best to help you obtain funding from the DHCP program!

Also check out a few other grants that are open this fall in this blog post: "Grants with Fall 2018 Application Deadlines"
