
The log system for security events provided in SPHINX Toolkit

A Security Information and Event Management (SIEM) system is a security management approach that provides a comprehensive view of an organization’s information security. These systems are an important component of company networks, IT infrastructures and the cybersecurity domain, because they consolidate and evaluate the messages and alerts produced by the individual components of an IT system.

At the same time, messages from specialized security systems (firewall logs, VPN gateways, etc.) can be taken into account. However, practice has shown that SIEM systems are extremely complex and can only be operated with considerable personnel effort. As a result, SIEM systems are often installed but then neglected in day-to-day operation.

The SIEM being developed for SPHINX aims to mitigate these flaws found in other SIEM systems by providing a practical tool that integrates seamlessly with as many components as necessary and offers a user-friendly interface for managing the technical and analytical aspects of the entire system. The complete architecture of the component is presented in the following design.

The SIEM component in SPHINX is, so far, subdivided into three main subcomponents: Document Manager, Event Manager and Metano Query Language (MQL). Each of them is responsible for a crucial feature of a SIEM system, namely data normalization, event management and text search, respectively. They have been separated this way to segment the project structure and to improve the development, scalability and maintainability of the component as a whole. The following subtopics describe each of them in more depth.

Document Manager

The Document Manager subcomponent is responsible for controlling the entry and search of all logs, providing input methods for file upload, file and/or directory monitoring, and TCP and HTTP connections.
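To make the file monitoring input more concrete, here is a minimal sketch in Python. It is not the SPHINX implementation: the `FileMonitor` class, the `normalize` helper and the fields of the normalized document (`source`, `received_at`, `message`) are assumptions made for illustration only. The idea is simply to poll a log file, read whatever was appended since the last poll, and turn each line into a uniform record.

```python
from datetime import datetime, timezone


def normalize(line: str, source: str) -> dict:
    """Wrap a raw log line in a uniform document (hypothetical schema)."""
    return {
        "source": source,
        "received_at": datetime.now(timezone.utc).isoformat(),
        "message": line,
    }


class FileMonitor:
    """Polls one file and returns only the lines added since the last poll."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.offset = 0  # byte position reached by the previous poll

    def poll(self) -> list:
        # Binary mode keeps seek/tell offsets exact regardless of encoding.
        with open(self.path, "rb") as fh:
            fh.seek(self.offset)
            chunk = fh.read()
            self.offset = fh.tell()
        lines = chunk.decode("utf-8", errors="replace").splitlines()
        return [normalize(line, self.path) for line in lines if line.strip()]
```

Calling `poll()` repeatedly (for example from a timer) yields each appended line exactly once. The sketch ignores log rotation and lines that are only partially written at poll time, both of which a real monitor would have to handle.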

Event Manager

The Event Manager subcomponent is responsible for managing all the queries that are scheduled to run periodically. Each scheduled query monitors specific events, generates alerts and triggers actions when abnormal operations are detected. Additionally, this subcomponent saves all query executions and alerts to a database, and all saved data is available through the listing methods of the exposed API.
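The behaviour described above can be sketched in a few lines of Python. This is an illustrative model, not the SPHINX code: the class and field names, the threshold-based alert rule and the in-memory lists standing in for the database are all assumptions. Time is passed in explicitly so that the scheduling logic is easy to follow and to test.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ScheduledQuery:
    name: str
    interval_s: int                      # how often the query should run
    search: Callable[[], List[dict]]     # stand-in for executing the query
    threshold: int                       # alert when matches >= threshold
    next_run: float = 0.0                # earliest time of the next execution


@dataclass
class EventManager:
    queries: List[ScheduledQuery] = field(default_factory=list)
    executions: List[dict] = field(default_factory=list)  # stand-in for a database table
    alerts: List[dict] = field(default_factory=list)      # stand-in for a database table

    def tick(self, now: float) -> None:
        """Run every query that is due at time `now` and record the outcome."""
        for query in self.queries:
            if now < query.next_run:
                continue
            matches = len(query.search())
            record = {"query": query.name, "at": now, "matches": matches}
            self.executions.append(record)
            if matches >= query.threshold:
                self.alerts.append(dict(record))
            query.next_run = now + query.interval_s

    def list_executions(self) -> List[dict]:
        return list(self.executions)

    def list_alerts(self) -> List[dict]:
        return list(self.alerts)
```

For example, a query registered with `interval_s=60` and ticked at 0, 30 and 60 seconds runs twice, and each run that returns at least `threshold` rows adds one alert.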

MQL – Metano Query Language

The MQL component is the Search Engine of the SPHINX SIEM system, and it shares its name with the language used to build the search queries. MQL is inspired by CAL (internal PDM language), SPL (Splunk Search Processing Language) and KQL (Kusto Query Language), and aims to be a high-performing query language for log files. The component has a single method, called Run, which executes the query in four major steps: Interpretation, which interprets the query; Get settings, which loads the data source settings and configurations; Fetch data, which queries the selected data source; and Transform data, which processes the data according to the criteria specified in the query. Each step is described in more detail below.

  • Interpretation: In this step, MQL extracts from the query all the information necessary for performing the search, such as data source settings, fields required in the response, filter conditions and even internal methods necessary for filtering and/or transforming the data.
  • Get settings: After the previous step, the workflow has all of the information needed to connect to the chosen data source. If the user did not provide the data source settings in the query, MQL searches an internal table of all previously registered data sources; if the data source is not found there, a default database is used.
  • Fetch data: Once the connection to the selected database is ready, the search is executed on that database and the results are retrieved.
  • Transform data: Finally, in this step, any required transformation is applied to the retrieved results. This can be a simple filter on the extracted data or an extraction on read, which means deriving new fields from the existing ones using regular expressions. In summary, this step transforms the originally retrieved data according to what was defined in the search query.
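The four steps above can be sketched as a small Python pipeline. This is only an illustration under stated assumptions: the MQL syntax is not given here, so the pipe-separated toy syntax (`source`, `contains`, `extract`, `fields`), the function names and the in-memory data sources are all hypothetical stand-ins for the real language and databases.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical registered data sources; a real system would hold connection settings.
REGISTERED_SOURCES = {
    "auth": [
        {"message": "login failed user=alice ip=10.0.0.5"},
        {"message": "login ok user=bob ip=10.0.0.7"},
    ],
}
DEFAULT_SOURCE = [{"message": "system started"}]


@dataclass
class ParsedQuery:
    source: Optional[str] = None
    contains: Optional[str] = None   # substring filter
    extract: Optional[str] = None    # regex with named groups (extraction on read)
    fields: List[str] = field(default_factory=list)


def interpret(query: str) -> ParsedQuery:
    """Step 1: extract source, filter, extraction and output fields from the query."""
    parsed = ParsedQuery()
    for stage in (s.strip() for s in query.split("|")):
        keyword, _, arg = stage.partition(" ")
        arg = arg.strip()
        if keyword == "source":
            parsed.source = arg
        elif keyword == "contains":
            parsed.contains = arg
        elif keyword == "extract":
            parsed.extract = arg
        elif keyword == "fields":
            parsed.fields = [name.strip() for name in arg.split(",")]
        else:
            raise ValueError(f"unknown stage: {stage!r}")
    return parsed


def get_settings(parsed: ParsedQuery) -> List[dict]:
    """Step 2: look up the registered source, falling back to the default one."""
    return REGISTERED_SOURCES.get(parsed.source, DEFAULT_SOURCE)


def fetch_data(source: List[dict], parsed: ParsedQuery) -> List[dict]:
    """Step 3: run the search against the selected source (copies each row)."""
    return [dict(row) for row in source
            if parsed.contains is None or parsed.contains in row["message"]]


def transform_data(rows: List[dict], parsed: ParsedQuery) -> List[dict]:
    """Step 4: derive new fields with a regex, then keep only the requested fields."""
    if parsed.extract:
        pattern = re.compile(parsed.extract)
        for row in rows:
            match = pattern.search(row["message"])
            if match:
                row.update(match.groupdict())
    if parsed.fields:
        rows = [{name: row.get(name) for name in parsed.fields} for row in rows]
    return rows


def run(query: str) -> List[dict]:
    parsed = interpret(query)
    source = get_settings(parsed)
    return transform_data(fetch_data(source, parsed), parsed)
```

In this toy syntax, `run(r"source auth | contains failed | extract user=(?P<user>\w+) ip=(?P<ip>[\d.]+) | fields user, ip")` keeps the failed login and returns only its extracted `user` and `ip` fields. Because stages are split on `|`, a regular expression containing that character would need escaping support that the sketch leaves out.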

The main objective of the SPHINX SIEM component is to provide a solid log management tool for security-related events that centre administrators can rely on to respond to incidents as early as possible. To that end, the SIEM ecosystem will encompass log collection and normalization, data correlation, alert management and reporting, in order to enable near real-time analysis of the security alerts generated by network hardware and applications.

More information about the SIEM system can be found in Deliverable 4.5, which is publicly available here.