Device Integration
General concept
The principles for device integration are described in: https://doi.org/10.1016/j.acax.2019.100007 Several descriptions and images are derived from the above mentioned manuscript published in Analytica Chimica Acta X and the assigned supplemental information.
Referring to the diverse origin of data in chemistry labs, we implemented three different procedures to allow the availability of data to the Chemotion ELN server (Figure 1). The first procedure includes the transfer of data (preferably small data) by a mailing system which manages the capture of NMR data and additional information that is sent by mail to the email account of the Chemotion ELN. A second procedure was designed for the transfer of data from a local server or computer with network connection to the Chemotion ELN server. The last procedure was implemented for the transfer of data from devices without network access (due to safety standards, location issues or software-related issues). For the transfer of data from devices without network connectivity, we implemented a procedure which requires the manual collection of data (for example with a portable data storage e.g. an USB flash drive) and its transfer to the Chemotion ELN via WinSCP running on a computer with network connection.
Figure 1: Overview of the three procedures to integrate research data from (analytical) devices to an Chemotion ELN. The given analytical methods are chosen as examples to illustrate the procedure. They can be exchanged depending on the local infrastructure.
General requirements concern in general the device and Chemotion ELN specific adaptations, therefore, the following documentation contains always aspects of Chemotion ELN setting adaptation and device specific settings. The Chemotion ELN specific adjustments depend on the use of a mail-collector (used to collect data transferred by email to a mail-server) or a datacollector (used to collect data transferred to a server) as a basic process for the desired data transfer.
General Preparation aspects
Primarily the integration of devices depends on several factors as,
- Operating system of the PC
- Vendor software of the device
- Network connection (What IP address and what type of network)
- File types etc
Some basic requirements to integrate the devices are as follows,
- Operating system: Windows (Minimum Windows XP. No Windows 98 or Windows 95), preferably Windows 10 or Windows 11. Linux based system is also possible
- Network: Access to the KIT intranet. Either a static IP address or FQDN (fully qualified domain name) is required
- Administrator rights to the system’s operating system
Name conventions
To be able to transfer data and assign data to a users Chemotion ELN-account, users have to follow strict rules regarding the labels/names of measurements. Measurements have to be submitted in a form: (1) "UserShortLabel"-"number""additional" or (2) "groupID""UserShortLabel"-"number""additional"
"UserShortLabel" = Label for each user given by registration to the Chemotion ELN
"number" = any integer
"additional" = any additional information. Valid are numbers, letters (case insensitive), hyphens.
"groupID" = number given as identifier for a distinct user group in the Chemotion ELN.
Examples for (1): NJ-101; NJ-101-A; NJ-101-5; NJ-101_pure; ABCD-1234-B Also, if many groups are using the Chemotion ELN, a groupID may help to sort users and therefore their measurements according to a group ID. The measurements should then be named according to examples (2): 2NJ-101; 2NJ-101-A; 2NJ-101-5; 2NJ-101_pure; 2ABCD-1234-B. Please take care that the hyphen is included!
Mail-collector
One of the easiest possibilities to transfer data to a management system such as the Chemotion ELN is to send data per email. The software of some analytical devices support the mailing of measurements to a user defined email address by default. This allows to implement storage routines that are independent of any additional software or hardware on the instrument’s side. The only prerequisite for a use of the MailCollector routine is fulfilled if one can set the email to be send to two addresses, to the Chemotion ELN server associated email address and to the user’s email address registered with the corresponding Chemotion ELN account (Figure 2). This is necessary to identify the target of the mailed information and the corresponding account in the Chemotion ELN. In short, four configuration steps are necessary to fetch the information via email:
• First, it is necessary to have an email account for the Chemotion ELN with any provider or server that supports Internet Message Access Protocol (IMAP) to centrally collect the emailed data.
• Second, the device software has to be configured to send the data to the user’s (scientist’s) email account as a main recipient and to this Chemotion ELN email address as a main recipient or in cc.
• Third, the devices sending emails should be whitelisted by the Chemotion ELN server. For this, the devices information is persisted in the Chemotion ELN DB as ‘device’ entities. The device class is a subclass of the Chemotion ELN User class and each device’s email can then be registered (step 3, Figure 2).
• The last step for the implementation of the mailing routine is the activation of the MailCollector service within the Chemotion ELN (this is also done by the administrator of the Chemotion ELN server). For this, the credentials (email address and password) and email server IMAP settings for the Chemotion ELN mail account as well as the cron time schedule parameters are configured (step 4, Figure 2).
The MailCollector (see Figure 2 for the used routine) screens the Chemotion ELN account for new emails and processes the data subsequently. The processing follows the routine: inspect – store – delete. New emails are inspected with respect to the sender (device) and the recipient (user). If both are known to the MailCollector, the service adds the data in the email’s attachment to the Chemotion ELN server and deletes the mail from the inbox of the Chemotion ELN email account. The MailCollector can be used e.g. for the transfer of NMR data from Bruker instruments.
Additionally, the mailing procedure is very helpful if the information to be added to an Chemotion ELN is captured with a mobile device or is available on a computer. In this manner, almost all data, independent of its origin, can be transferred to the Chemotion ELN. The only step necessary for this procedure is sending an email (using the same email address the user is registered within the Chemotion ELN) to the Chemotion ELN’s email account. Data that is attached to the email will be processed according to the MailCollector as described above and will be assigned to the user Chemotion ELN account according to the email address of the sender (Figure 2).
Figure 2: Data collection via the MailCollector routine implemented exemplarily for Bruker NMR spectrometers and additional information sent by email. Configuring the data transfer via the MailCollector.
Datacollectors for devices with LAN/WLAN
There are two possibilities to enable an automatic transfer of data to the Chemotion ELN from devices that store data on a local computer or server. Either the data is actively transferred via a client application from the local computer/server to the Chemotion ELN server or the Chemotion ELN server directly accesses the device data storage folder. The first solution requires more efforts for the installation and maintenance of several client applications, therefore, we followed a central solution (Figure 3A) where both, the device computer and the Chemotion ELN server, have access to a common storage. Considering the availability of transfer software packages for the server and for the device computers, we focused on SFTP over NFS, CIFS/SAMBA, FTPS to securely transfer files. SFTP is relatively well supported and free third-party software and can be found even for legacy operation systems (OS) such as windows XP. Though the latter OS supports mapping of the network drive, the obsolete and unsafe SMBv1 protocol cannot be accepted. Having such an OS in a network is nowadays a high security issue. These issues can be circumvented by entirely isolating the computer and device network and configuring a firewall exception for ssh communication with the external remote storage. To allow the Chemotion ELN to access the data, four basic adaptions are implemented:
• In a first step, the device has to be prepared to enable data storage on a remote folder (SFTP server) in addition to a data storage on a local computer or server. This configuration of the devices can be easily done if the instrument offers this option by default. In all other cases, a few changes have to be made according to the description given below in the subchapter “necessary device adaptions” (step 1, Figure 3B).
• Second, folders for each device are created on a remote network drive. File access credentials are also generated. In our implementation, the central storage was provided by the Large Scale Data Facility, LSDF, in Karlsruhe, and a read/write account was created for each device folder. The results of the measurements are mirrored on this central storage in the folder reserved for the device. In addition, a copy of the raw data is being kept locally (step 2, Figure 3B).
• In a third step, the device information along with a specific profile are registered to the database of the Chemotion ELN as described in Procedure 1. The necessary information added to the new device entry in the DB includes the network drive (or server) for the remote folder, the corresponding path and the access credentials (step 3, Figure 3B).
• The last step of the implantation consists of the activation of a DataCollector through the corresponding configuration file of the Chemotion ELN server (step 4, Figure 3B). Similar to the MailCollector, it requires a cron schedule type parameter at which the DataCollector becomes active. It also accepts the network credentials for sftp password authentication (instead of storing the passwords, it is also possible to configure rsa key authentication).
With the DataCollector activated, all data stored on remote accounts are processed at the set times. This routine works for all instruments that are connected to the collector and belong to the devices that are registered in the Chemotion ELN. The DataCollector is a routine that works similar to the MailCollector described in Procedure 1, but uses, in contrast to the MailCollector, SFTP.
Two different DataCollector types (FileCollector and FolderCollector) were developed and are used depending on whether the device stores measurement data either as a single file or as a folder including a defined number of objects. The FileCollector reads single data files which were written into the predefined remote folder, while the FolderCollector inspects the remote folder of a device for new sub-folders (the codes of the FileCollector and FolderCollector are given in the Supporting Information). It is critical for this mode to know if the measurements have been completed and all files have been written. This can be achieved by defining the number of files that have to be present within a data folder for a completed measurement. If all data has been written, the expected file number is reached and the folder can be processed and assigned to an Chemotion ELN user. All files are compressed in a ZIP folder and become accessible through the Chemotion ELN account of the identified Chemotion ELN user.
To allow the assignment of the data collected by the DataCollectors to a specific Chemotion ELN user, only a few rules have to be respected: the name of the file has to start with a unique identifier registered with the Chemotion ELN user account. This identifier has to be separated from the experiment number via a hyphen or a period (e.g. JD-xyz for the registered user John Doe and a measurement xyz). If the initials can be assigned to a registered Chemotion ELN user, the data is registered in the Chemotion ELN account for the identified user. The processing of data by the DataCollectors follows the same routine “inspect – store – delete” as used in case of the MailCollector. All data that is found by the DataCollectors within the remote folders is transferred to the Chemotion ELN server and deleted from the remote folder. Deleting successfully transferred data is of high importance to maintain a good performance of the Chemotion ELN and data transfer server, as the amount and size of datasets increase with the running time and the resources required for their inspection increases as well. The deletion of the data in the remote folder is also done if no user of the Chemotion ELN can be assigned to the datasets. All changes are documented in a log file.
Figure 3: Data collection via DataCollectors (FileCollector and FolderCollector) as implemented for the devices GC (MS), LC (MS), Raman and mass spectrometer (general labeling as Device 1 to Device 4). Right: Device configuration for data transfer via additional file copies as described in detail in the following chapter 4. Workflow of the installation of a data transfer routine via a DataCollector.