Amazon recently announced the general availability of the AWS IoT Analytics service, which at the moment is available only in the Ireland, Ohio, Oregon, and Northern Virginia regions. In a nutshell, AWS IoT Analytics is a fully managed service that analyses all the data collected from your IoT devices.
Benefits of using AWS IoT Analytics
The Analytics service gives users the ability not only to collect and store that data, but also to query it and process messages. Data can be conveniently visualised using Amazon QuickSight, and Jupyter Notebooks can be integrated for machine learning.
Following the initial preview, the AWS team has further expanded the features of the service by adding a BatchPutMessage API for feeding in data from external sources, and a SampleChannelData API that makes it easier to preview messages, reprocess stored data, and inspect pipeline results.
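To make the BatchPutMessage API more concrete, here is a minimal sketch of the request shape it accepts: each message carries a unique messageId and a raw-bytes payload (JSON-encoded below). The channel name and sensor readings are hypothetical, introduced only for illustration.

```python
import json

def build_batch_put_message(channel_name, readings):
    """Build a BatchPutMessage-style request: each message needs a
    unique messageId and a payload of raw bytes (JSON-encoded here)."""
    return {
        "channelName": channel_name,
        "messages": [
            {"messageId": str(i), "payload": json.dumps(r).encode("utf-8")}
            for i, r in enumerate(readings)
        ],
    }

request = build_batch_put_message(
    "my_channel",
    [{"device": "sensor-1", "temperature": 21.4},
     {"device": "sensor-2", "temperature": 19.8}],
)
# The actual call would then be something like:
#   boto3.client("iotanalytics").batch_put_message(**request)
print(len(request["messages"]))  # number of queued messages
```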
There are many benefits of using this service including:
- predictive analysis of data
- visualisation of output
- data cleansing
- identification of patterns in collated data
Some Key Ideas about AWS IoT Analytics
The IoT Analytics service offers a number of sophisticated concepts which can be unpacked for simplicity's sake. Customers focused on data preparation will use Data Stores, Channels and Pipelines; customers focused on data analysis can use Notebooks and Datasets. We will briefly expand on each of these concepts.
Data Stores
Data Stores are queryable storage for the output of Pipelines. When a query is run, the result is delivered as a Dataset, and storage periods can be customised to minimise costs. Data arriving simultaneously from multiple sources can be filtered into many Data Stores.
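The customisable storage period mentioned above maps onto a retention setting when the store is created. Below is a sketch of the parameters involved; the store name and the 90-day figure are assumptions, not from the article.

```python
# Hypothetical Data Store definition with a customised retention
# period, so processed data is kept 90 days rather than indefinitely,
# keeping storage costs down.
datastore_params = {
    "datastoreName": "sensor_store",
    "retentionPeriod": {"numberOfDays": 90},
}
# The actual call would be something like:
#   boto3.client("iotanalytics").create_datastore(**datastore_params)
print(datastore_params["datastoreName"])
```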
Channels
Channels are scalable entry points for data ingested from external inputs via the ingestion API. Data is formatted as binary or JSON, and if need be, alternative logic can be applied later to reprocess the raw data. Channels archive the raw messages and can gather data from MQTT.
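Gathering data from MQTT typically means an AWS IoT Core topic rule forwarding matching messages into the channel. The sketch below shows the shape of such a rule; the topic filter, rule name, and channel name are hypothetical.

```python
# Hypothetical topic rule: select telemetry messages from MQTT and
# forward them into an IoT Analytics channel.
topic_rule = {
    "sql": "SELECT * FROM 'sensors/+/telemetry'",
    "actions": [
        {"iotAnalytics": {"channelName": "raw_channel"}}
    ],
}
# The actual call would be something like:
#   boto3.client("iot").create_topic_rule(
#       ruleName="route_to_analytics",
#       topicRulePayload=topic_rule)
print(topic_rule["actions"][0]["iotAnalytics"]["channelName"])
```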
Pipelines
Pipelines consume the messages sent via Channels and allow users to process them in a variety of ways, such as setting filters on attributes, enriching messages with data from external sources, invoking Lambda functions, and adding or removing fields. The processed data then moves on to a Data Store.
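The channel-to-pipeline-to-store flow can be sketched as a chain of pipeline activities, each naming the next step. All names and the filter expression here are hypothetical.

```python
# Minimal pipeline sketch: read from a channel, filter on an
# attribute, then write the surviving messages to a data store.
pipeline_activities = [
    {"channel": {
        "name": "from_channel",
        "channelName": "raw_channel",
        "next": "drop_cold_readings"}},
    {"filter": {
        "name": "drop_cold_readings",
        "filter": "temperature > 10",   # keep only warm readings
        "next": "to_store"}},
    {"datastore": {
        "name": "to_store",
        "datastoreName": "sensor_store"}},
]
# The actual call would be something like:
#   boto3.client("iotanalytics").create_pipeline(
#       pipelineName="sensor_pipeline",
#       pipelineActivities=pipeline_activities)
print(len(pipeline_activities))  # number of activities in the chain
```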
Datasets
If you are familiar with SQL databases, a Dataset is similar to a view. Datasets are created when a user runs a query against a Data Store. This can be done on a one-off manual basis or set to run automatically on a recurring schedule.
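A recurring Dataset pairs a SQL query with a schedule trigger. The sketch below assumes a hypothetical store, query, and daily cron schedule.

```python
# Hypothetical dataset: re-run an aggregation query every day at
# 06:00 UTC and deliver the result as fresh dataset content.
dataset_params = {
    "datasetName": "daily_temperatures",
    "actions": [{
        "actionName": "query_store",
        "queryAction": {
            "sqlQuery": "SELECT device, AVG(temperature) AS avg_temp "
                        "FROM sensor_store GROUP BY device"},
    }],
    "triggers": [
        {"schedule": {"expression": "cron(0 6 * * ? *)"}}
    ],
}
# The actual call would be something like:
#   boto3.client("iotanalytics").create_dataset(**dataset_params)
print(dataset_params["datasetName"])
```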
Notebooks
These are Jupyter notebooks, hosted on Amazon SageMaker, that let users write custom code to analyse data and build machine learning models.
Setting up IoT analytics with this new AWS service is a fairly straightforward process, and it expands on what can be done with the data that gets collected.