
ARM 4 Datastores

Not all datastores are appropriate for all environments. Take a few moments to review the features of each and decide which is appropriate for your application.

Supported Datastores

Berkeley DB

Berkeley DB is the default datastore and offers the best collection performance at the cost of increased complexity. Insertions into the datastore are fast and have minimal impact on application performance, which makes it a good choice for testing and development environments. Fast insertion is also desirable in a production environment, but interactive queries against the datastore are difficult and often expensive. Queries may also lock tables, which can have an impact on the applications themselves.

Checkpointing

Checkpointing is an operation that allows the Berkeley DB to recover from application failures. Unfortunately, while a checkpoint is in progress, applications that have ARM requests queued may be blocked, which can have a significant impact on performance. Checkpointing can be turned off in the configuration file, but be aware that this may cause data to be lost in the event of a power failure or unclean shutdown. The arm4_control program has a checkpoint command that gives you greater control over when checkpoints occur, for example by running it as part of a regular cron job.
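As a rough illustration of the cron approach, the sketch below wraps the checkpoint command in a small shell function. It assumes the command is invoked as "arm4_control checkpoint". The ARM4_CONTROL override, the function name, and the script path in the crontab comment are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Sketch: trigger a Berkeley DB checkpoint at a time you choose.
# Assumption: the checkpoint command is run as "arm4_control checkpoint".
# ARM4_CONTROL is a hypothetical override for the program's location.
ARM4_CONTROL="${ARM4_CONTROL:-arm4_control}"

run_checkpoint() {
    # Report success or failure so that cron can mail any errors.
    if "$ARM4_CONTROL" checkpoint; then
        echo "checkpoint ok"
    else
        echo "checkpoint failed" >&2
        return 1
    fi
}

# Example crontab entry (hypothetical script path), every 15 minutes:
# */15 * * * * . /usr/local/bin/arm4_checkpoint.sh && run_checkpoint
```

Scheduling the checkpoint for quiet periods keeps the blocking described above away from peak load. A suitable interval depends on how much data you can afford to lose.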

The database is always checkpointed when an archive is created or the database is shut down using arm4_control.

Archiving

An archive of collected data can be created using the arm4_control archive command. This generates a copy of all definitions and transaction information in the backup directory specified in the arm4.conf file. Transaction information is cleared from the current datastore, but application, metric, and transaction definitions are retained. This allows the databases to be examined offline without impacting running applications.

Sqlite3

One of the main advantages of using Sqlite3 is that it supports a rich set of SQL queries, making examination of the collected data much easier. The main disadvantage is collection performance. The performance of Sqlite3 is significantly lower than that of the Berkeley DB datastore, and this can impact running applications. This can be mitigated to some degree by sampling transactions instead of collecting all transaction instances, but it may still have an impact on your running system.

One common way to use the Sqlite3 datastore is to collect using Berkeley DB, and convert the Berkeley database to a Sqlite3 database for analysis. The Datastore Conversion section gives details on how to do this.

Build Considerations

Many Linux distributions will include a version of Sqlite3 built with the --enable-threads option, but this is insufficient for correct operation. The current daemon requires that Sqlite3 also be built with --enable-cross-thread-connections. If your library wasn't built with this option then you'll get errors indicating that the library is being called out of sequence.

Datastore Conversion

Conversion from one datastore to another uses user space configuration files and instances. See Arm4UserGuide for more information on how to use them.

  • Export the current database using arm4_control export all or arm4_export, redirecting the output to a file (e.g. exported_datastore.xml)
  • Create a user space instance that will contain the converted datastore

    e.g. using configuration file conversion_arm4.conf:

    instance = 1
    db = sqlite
    db_home_dir = /tmp/conversion

  • Import the exported datastore XML into the new datastore

    $ arm4_control import exported_datastore.xml
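
The steps above could be collected into a small script along the following lines. This is only a sketch. The commands and configuration keys are the ones shown on this page, while the function names and the ARM4_CONTROL override are hypothetical. Creating the datastore directory in advance is also an assumption. How arm4_control is pointed at the user space instance is covered in Arm4UserGuide and is deliberately not shown.

```shell
#!/bin/sh
# Sketch of the conversion steps. Function names and the ARM4_CONTROL
# override are hypothetical; commands and config keys come from this page.
ARM4_CONTROL="${ARM4_CONTROL:-arm4_control}"

# Step 1: export the current datastore to an XML file ($1).
export_datastore() {
    "$ARM4_CONTROL" export all > "$1"
}

# Step 2: write the user space configuration file ($1) for a Sqlite3
# datastore stored in directory $2.
write_conversion_config() {
    conf_file="$1"
    home_dir="$2"
    # Assumption: the datastore directory should exist beforehand.
    mkdir -p "$home_dir"
    cat > "$conf_file" <<EOF
instance = 1
db = sqlite
db_home_dir = $home_dir
EOF
}

# Step 3: import the XML file ($1) into the new datastore. This must run
# against the user space instance; see Arm4UserGuide for how to select it.
import_datastore() {
    "$ARM4_CONTROL" import "$1"
}
```

A typical run would call export_datastore exported_datastore.xml, then write_conversion_config conversion_arm4.conf /tmp/conversion, and finally import_datastore exported_datastore.xml once the user space instance has been selected.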