Oracle has released Tribuo, a machine learning library for Java, as open source under the Apache 2.0 license on GitHub. The library originated in the Oracle Labs Machine Learning Research Group and is said to have been used in production within Oracle for several years.
With Tribuo, the researchers are responding to the observation that large software systems want to use building blocks that describe themselves and know when their inputs or outputs are invalid. According to Oracle, most ML frameworks do not adequately address this requirement in how they handle models, because most ML libraries simply expect a stack of float arrays to train a model.
In addition, the researchers note that tracking models in production is difficult because it requires external systems to maintain the link between a deployed model and its training process and data. Usually, the burden of these additional requirements falls on the teams that integrate ML libraries into their products or systems. It would arguably be better to embed this task in the ML library itself.
On top of that, most ML libraries are written in dynamically typed languages such as Python and R, while most enterprise systems are written in statically typed languages such as Java. As a result, developers face significant code maintenance and system overhead even when implementing simple ML components, because code must be written in multiple languages and run in multiple runtimes.
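The idea of building blocks that describe themselves, reject invalid inputs and carry a link back to their training data can be illustrated with a small sketch in plain Java. This is not Tribuo's API: the class names `Provenance` and `SelfDescribingModel`, their fields, and the choice to throw an exception on unknown features are all hypothetical and serve only to show the concept, in contrast to a model that accepts an anonymous float array.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical record of where a model came from (not Tribuo's API). */
final class Provenance {
    final String trainerName;
    final String dataSource;
    final Instant trainedAt;

    Provenance(String trainerName, String dataSource, Instant trainedAt) {
        this.trainerName = trainerName;
        this.dataSource = dataSource;
        this.trainedAt = trainedAt;
    }

    @Override
    public String toString() {
        return trainerName + " on " + dataSource + " at " + trainedAt;
    }
}

/** Hypothetical linear model that knows its feature names and its origin. */
final class SelfDescribingModel {
    private final Map<String, Double> weights; // feature name -> weight
    private final Provenance provenance;

    SelfDescribingModel(Map<String, Double> weights, Provenance provenance) {
        this.weights = new LinkedHashMap<>(weights); // defensive copy
        this.provenance = provenance;
    }

    /** Scores a named example; rejects features the model never saw. */
    double score(Map<String, Double> example) {
        double sum = 0.0;
        for (Map.Entry<String, Double> e : example.entrySet()) {
            Double w = weights.get(e.getKey());
            if (w == null) {
                throw new IllegalArgumentException("Unknown feature: " + e.getKey());
            }
            sum += w * e.getValue();
        }
        return sum;
    }

    /** The model itself carries the link to its training process and data. */
    Provenance provenance() {
        return provenance;
    }
}

final class SelfDescribingModelDemo {
    public static void main(String[] args) {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("age", 0.5);
        weights.put("income", 2.0);
        SelfDescribingModel model = new SelfDescribingModel(
                weights, new Provenance("demo-trainer", "train.csv", Instant.now()));

        Map<String, Double> example = new LinkedHashMap<>();
        example.put("age", 4.0);
        example.put("income", 1.5);
        System.out.println("score = " + model.score(example));
        System.out.println("trained by " + model.provenance());
    }
}
```

Because features are addressed by name rather than by array position, a caller cannot silently pass values in the wrong order, and the provenance travels with the model instead of living in an external tracking system.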
What Tribuo is and what it can do
Tribuo offers standard ML features such as classification, clustering, anomaly detection and regression algorithms. It also provides pipelines for data and text processing, as well as feature-level transformations. The library offers interfaces to ONNX Runtime as well as to the ML libraries TensorFlow and XGBoost. This allows models stored in ONNX format or trained in TensorFlow or XGBoost to be used alongside Tribuo's native models. It also makes it possible to deploy models in Java that were previously trained with Python packages such as scikit-learn and PyTorch. The TensorFlow support is still experimental.
Tribuo is written in Java and requires Java 8 or higher. Documentation, tutorials and introductory guides are available on the Tribuo website. The accompanying blog post also explains the motivation behind the library.