fabiola-logo

Description

FABIOLA enables solving Constraint Optimisation Problems (COPs) over large datasets by using Big Data technologies in a user-friendly way. It allows users to (1) create COP models; (2) integrate different data sources; (3) map dataset attributes to COP-model variables; (4) solve the COPs in a distributed way; and (5) perform advanced queries on the results.

The FABIOLA Big Data layer is based on Apache Spark. The most recent version relies on the COP solver choco-solver. The user interface is composed of a REST API implemented in NodeJS, and a front-end based on AngularJS.

The architecture fulfils the principles of low coupling and high cohesion. Communication among the modules that compose the architecture is performed through REST APIs. In this way, all components are highly independent, and modifying or scaling any of them has little impact on the others.

The deployment of the Big Data layer is performed on a DC/OS cluster, which provides a highly elastic environment: extending or reducing the number of available nodes is an easy and transparent process. The backend and the front-end, in turn, are deployed as Docker images.

fabiola-architecture

fabiola-deployment

A quick tour

Below is a quick tour of the features of this tool.

Managing datasets

01_LIST_DATASETS

This is the view where users can see their datasets.

 

02_SHOW_DATASET

By clicking the show button on a dataset, the user gets a detailed view of the dataset's features. Note the validate button: when it is clicked, the system queries the dataset and checks whether it is correct. If so, its status changes to validated.

 

03_SHOW_DATASET_ERROR_MSG

This is an example of a bad dataset configuration. After validation is attempted, the view shows that the dataset could not be validated, together with the cause of the failure.

 

05_CREATE_DATASET

Creating a dataset is very simple. Users can import datasets from their own data sources: FABIOLA can read from several storage systems, such as HDFS and MongoDB.
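As an illustration of how a Spark-based layer can read from such sources, the sketch below loads a CSV file from HDFS and a MongoDB collection. This is not FABIOLA's actual ingestion code: the paths, URIs and connector options are placeholders, and the MongoDB read assumes the MongoDB Spark Connector is on the classpath (option names vary across connector versions).

    import org.apache.spark.sql.SparkSession

    // Minimal ingestion sketch; paths and URIs are placeholders.
    val spark = SparkSession.builder()
      .appName("fabiola-ingestion-sketch")
      .getOrCreate()

    // Read a CSV dataset stored in HDFS (hypothetical path).
    val hdfsDataset = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("hdfs://namenode:8020/datasets/consumption.csv")

    // Read a MongoDB collection, assuming the MongoDB Spark Connector is available.
    // Adjust the format name and options to the connector version in use.
    val mongoDataset = spark.read
      .format("mongo")
      .option("uri", "mongodb://mongo-host:27017/fabiola.consumption")
      .load()

    hdfsDataset.printSchema()
    mongoDataset.printSchema()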

06_CREATE_DATASET_LOCAL

Users can also upload a dataset from their own computer.

Creating COP Models

07_LIST_COP_MODELS

In this view, users can see the list of COP models they have defined.

09_CREATE_COP_MODEL

This is an example of a COP model. COP models are defined in Scala using the choco-solver library.
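For readers unfamiliar with choco-solver, a minimal model written in Scala could look like the sketch below. It is not the model shown in the screenshot: the variables, bounds and constraints are purely illustrative.

    import org.chocosolver.solver.Model

    object ExampleCopModel extends App {
      // Illustrative COP: split a fixed total between two integer quantities
      // while minimising a weighted cost. Bounds and coefficients are made up.
      val model = new Model("example-cop")

      val x    = model.intVar("x", 0, 10)
      val y    = model.intVar("y", 0, 10)
      val cost = model.intVar("cost", 0, 100)

      model.arithm(x, "+", y, "=", 10).post()                  // x + y = 10
      model.scalar(Array(x, y), Array(3, 5), "=", cost).post() // 3x + 5y = cost
      model.setObjective(Model.MINIMIZE, cost)

      val solver = model.getSolver
      while (solver.solve()) {
        println(s"x=${x.getValue}, y=${y.getValue}, cost=${cost.getValue}")
      }
    }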

 

Managing instances

10_LIST_INSTANCES

In this view, all instances are listed. An instance represents an execution, and is related to a dataset and a COP model.

12_SHOW_NOT_STARTED_INSTANCE

This is a detailed view of an instance that has not been executed yet.

 

11_SHOW_INSTANCE

This is a detailed view of an instance that has been executed. By clicking the results button, the user can access the querying tool of FABIOLA.

13_CREATE_INSTANCE_SELECT_COP_MODEL

This is the creation process of a new instance. First, the COP model must be selected.

 

14_CREATE_INSTANCE_SELECT_DATASET

The next step is to select the dataset to be employed in this instance.

16_CREATE_INSTANCE_PREVIEW_DATASET_SCHEMA

Each dataset has a preview view, where users can inspect the details of its schema.

 

17_CREATE_INSTANCE_DATA_MAPPING

Once the COP model and the dataset have been selected, the user must map the dataset attributes to the COP variables. As explained in [1], there are three types of variables: IN, OTHER, and OUT. FABIOLA allows performing this mapping on a drag-and-drop basis.
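Conceptually, each mapping entry ties a dataset attribute to a COP variable and assigns it one of the three roles. The snippet below is a hypothetical representation of such a mapping; the Mapping class, the role names and the attribute names are invented for illustration, since FABIOLA builds the mapping through the drag-and-drop interface.

    // Hypothetical representation of a data mapping; all names are illustrative only.
    sealed trait VariableRole
    case object In    extends VariableRole // value read from the dataset
    case object Other extends VariableRole // auxiliary parameter of the model
    case object Out   extends VariableRole // value produced by solving the COP

    final case class Mapping(datasetAttribute: String, copVariable: String, role: VariableRole)

    val exampleMapping = Seq(
      Mapping("monthly_consumption", "consumption", In),
      Mapping("contracted_tariff",   "tariff",      Other),
      Mapping("optimal_power",       "power",       Out)
    )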

 

18_CREATE_INSTANCE_SYSTEM_CONFIGURATION

The last step before creating the instance is to select the desired configuration parameters for the cluster.

 

Querying results

19_RESULTS_VIEW_TABLE_SELECT_COLUMNS

This is an example of the querying tool of FABIOLA. In this view, users can select the fields to query. Output fields are those generated by the COP and defined by the user during the data mapping process.

 

20_RESULTS_VIEW_TABLE_QUERY

Example of a tabular view.

 

Screenshot from 2018-05-30 16-36-26

FABIOLA also supports aggregation operators. Users can choose data attributes and operators, and display the aggregated data in tables or maps.
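Under the hood, this kind of aggregation amounts to a group-by over the result set. Purely as an illustration (this is not FABIOLA's internal query API; the column names and values are invented), an equivalent Spark aggregation would look like this:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{avg, max}

    val spark = SparkSession.builder()
      .appName("fabiola-aggregation-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Tiny made-up result set: (customer, province, optimal contracted power).
    val results = Seq(
      ("c1", "Seville", 3450),
      ("c2", "Seville", 4600),
      ("c3", "Madrid",  5750)
    ).toDF("customer", "province", "optimal_power")

    // Average and maximum optimal power per province.
    results
      .groupBy("province")
      .agg(
        avg("optimal_power").as("avg_optimal_power"),
        max("optimal_power").as("max_optimal_power")
      )
      .show()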

 

Results

The proposal was tested in an industrial scenario. Several Spanish electricity companies wanted to determine the optimal power that each of their customers should contract in order to minimise their consumption. Our study [1] demonstrated that solving the COPs in a distributed way drastically improved the global execution time, and including more worker nodes might improve performance further.

[1] Valencia-Parra, Á., Varela-Vaca, Á. J., Parody, L., & Gómez-López, M. T. (2020). Unleashing Constraint Optimisation Problem Solving in Big Data Environments. Journal of Computational Science, 45, 101180. https://doi.org/10.1016/j.jocs.2020.101180

Source code

  • FABIOLA Backend: GitHub repository.
  • FABIOLA User Interface: GitHub repository.