Geotab Ace: Revolutionizing Fleet Management with Responsible Generative AI
Published on December 2, 2024
Table of Contents
Introduction
Geotab continues to invest in the latest technologies that enhance data intelligence for our customers. Generative AI (Gen AI) revolutionizes the way fleet data can be utilized by democratizing access to the data to achieve business outcomes, through the use of interactive natural language queries. These queries can vary from simple to complex, and have a wide range of scope including fleet safety, electrification, maintenance, and so on.
We are excited to introduce our new solution, Geotab Ace™, an innovative AI assistant powered by Gen AI. Geotab Ace builds on the customer feedback we received during our 2023 beta proof of concept, Project G. Geotab has been a leader in the AI field for many years. With the introduction of Gen AI, we quickly recognized that there are many unparalleled opportunities and challenges introduced by technologies like ChatGPT. In fact, our work on Project G as well as our own internal Gen AI journey inspired us to publish a Generative AI maturity whitepaper to service other organizations who are also deploying Gen AI technologies.
This document aims to share our approach to responsible deployment of Gen AI within the context of Geotab Ace.
To level set on our general approach to privacy and security, we would like to direct the reader to our public policies that cover privacy, information security, etc.
In this document we focus on the unique challenges posed by this project and how we believe we overcame them. The document is divided into two sections, the first focuses on Responsible AI deployment, and the second focuses on Privacy. First, we begin with a brief summary of Geotab Ace, please note that more details can be found at geotab.com/ace.
Solution Overview
Geotab Ace is a natural language interface embedded into our telematics software, and is able to write code to safely query customer data. The goal of the product is to simplify and enrich the user experience. This interface is designed to enhance the accessibility of data analytics information and insights. The bottom line is that Geotab Ace will deliver insights that help support our customer investments in telematics.
To delve into some of the details of how this is made possible, our enterprise grade solution relies on a Large Language Model (LLM) provided by Google (Gemini). The model acts as an Agent that attempts to interpret user questions and then writes SQL code that securely queries the customer’s data. Geotab Ace then provides back the required data. The LLM also explains what it understood from the request, what assumptions it made and where it got the data to produce the output. In other words it breaks down the SQL query in plain language for everyone to understand. This gives full transparency so the users know what the model understood. The LLM itself does not generate the data and insights directly and it does not have access to the customer data. With the LLM being used as an Agent it writes the code to fetch the appropriate data, as opposed to attempting to ‘guess’. Further, the Agent has been "grounded" in a significant number of written queries produced by Subject Matter Experts andData Scientists on a significant number of query/response pairs to improve its accuracy and consistency.
This is all controlled by a proprietary orchestration (chaining) layer.
Now that we have provided a quick summary of Geotab Ace’s operation, we will now provide an overview of our Responsible AI deployment framework and our approach towards privacy.
Section 1: Overcoming Gen AI Challenges
In addition to the already known Responsible AI (mostly machine learning) deployment principles, Gen AI introduces additional challenges some of which are beyond the scope of the document. In the context of this project, we highlight the specific challenges faced by the team and how we overcame them using a multi-faceted solution. Specifically, our approach relied on:
- Human-centric design meaning that we have our end users in mind while designing our systems.
- As always, focusing on our high quality data collection
- Grounding the model in fact
- Red teaming or ethical hacking to stress test the models and
- Making sure we are deploying an enterprise grade solution.
The specific threat areas are summarized in the following table.
In the following we expand on the table to show some practical examples of how we infused Responsible AI into Geotab Ace:
Human-Centric Design
Geotab Ace is developed with an aim to democratize access to data, so it is built as a virtual assistant with the human decision maker in the loop. The system does not make decisions nor take actions on its own. It was developed by working directly with Geotab customers using our Project G POC to solicit feedback to improve our customer experience. Also, we routinely involved many Geotab team members in the development and testing process to make sure we are approaching this with the right diversity of thought.
We have also included a feedback option directly in the interface, users can report content, suggestions or issues that are perceived as incorrect or harmful. They can provide suggestions, and share experiences.
For Geotab Ace, we also emphasized explainability in our user experience providing the users with insights into how the system operates and reasons through the customer questions. The LLM deconstructs the query to provide full transparency in what the model understood from the user request. Therefore, all the steps taken by the LLM are made available, and the SQL code used by the system to obtain insights from Geotab systems is provided to the user for clarity and can be used by data professionals to ensure that the model is performing to expectations and for troubleshooting or feedback purposes.
High Quality Data
To ensure high quality data outputs from the system, we must start with quality data. Geotab has spent many years investing in high quality data collection to improve decision making using AI. For example, our focus on high fidelity curve logging for data collection. In our Data Decoded video series,we highlighted how quality ingredients power quality AI.
Grounding
To ensure the model’s performance we conduct extensive regression testing on potential prompts and resultant data returned from the database. We (Subject Matter Experts + Data Scientists) created an extensive database of question answer pairs to increase the accuracy of the system. Through rigorous testing we were able to assess the performance of the system to questions when compared with the correct responses.
Ethical Hacking
To overcome the unique challenges of Gen AI products we stay up to date on the most relevant research and apply red teaming tests. The tests aim to ensure the system responds appropriately on a wide range of trustworthiness, data privacy, and security performance tests. We have developed a specific set of prompts to test risks that are specific to the product and data sources including sensitive behavior profiling, solicitation of PII and decision making related to employment decisions
The following figures show some examples:
Figure: Example of Red-teaming questions and refusal responses from the system
Enterprise Grade Deployment Framework
Geotab relies on its existing enterprise grade technology investments in our product and solution architecture. Further, the LLM is deployed using Google Vertex AI with industry leading security standards. These details are outside the scope of this document, but some elements are highlighted in the privacy framework.
Furthermore, Geotab has put in place a legal framework to guarantee that using this product does not impact ownership of customer data. The data remains the property of the customer at all times and is never used to train the underlying LLM. Also, any insights produced by the system remain the property of the customer.
Secure Querying
Further we adopt secure querying through several layers of safety.
First, the model is never given any knowledge of the actual data location, or overall database structure. Examples are provided to the LLM with placeholder values. Any unknown placeholders which appear, either through model hallucinations or attempted injection attacks, will cause the chain to immediately fail.
Secondly, all data access is controlled through read-only permissions to the underlying data, so it can never be modified by the model.
Finally, we have an additional “smart” security layer which checks all incoming queries for known injection and other attack patterns and causes the chain to terminate early if any are found. To prevent any chance of the model seeing the result of a query, successful query execution itself causes termination of the Agent chain.
In the following section we further explore our privacy framework.
Section 2: Privacy
In this section, we highlight our privacy approach against industry standards on data protection.
Purpose Limitation and Data Minimization
Geotab Ace is using existing customer data in addition to insights provided by Geotab systems which are also available to the customer. As with any system there is a risk of users entering personal information or any other confidential data into the AI system via prompt. To mitigate this risk, we actively guide and educate users to refrain from inputting sensitive or irrelevant data.
From a data minimization perspective, we do not store returned data from the databases to minimize the chance of storing any personal information present in the customer’s fleet data being stored in chat history data. No user will be able to access another customer’s chat history data.
Data Protection
In addition to our advice to avoid providing personal information in the system for data privacy, we de-identify any Personal Identifiable Information (PII) in a stored prompt and obscure it before it is used for any product improvements. We design and deliver our products in such a manner that the data available to Gen AI systems is restricted according to a user’s own privileges .
Further, the results coming back from Geotab databases are not provided to the AI agent, but are only returned directly to the user.
This is designed to ensure that any data passed into our system can not be accessed by unauthorized users and is not available to our model either during user interactions or embedded via training of the models or in future product development activities.
No Customer Data sharing with Third-Party
No telematics data is shared with Google, only the contents of the question/query will be shared. Currently, we are utilizing the advanced capabilities of Vertex AI services. Customer data does not leave the secured confines of the existing Geotab data storage environment.
Data Access Controls
The chat model’s access to internal databases and chat history is consistent with a user’s product access for a seamless end to end solution for data access control. To confirm access control is maintained to the appropriate level it is subject to penetration testing during the development phase and ongoing unit testing is integrated into the development process.
Conclusion
As this document has shown, we believe that Gen AI acts as a great enabler for data democratization and creating insights for our customers. With our innovative and multi-faceted approach to Gen AI deployment we help our users extract the maximum value from the system while balancing the safety of these complex systems.
To schedule a demo, visit: geotab.com/ace.
About Geotab
Geotab, the global leader in connected vehicle and asset solutions, leverages advanced data analytics and AI to enhance fleet performance, safety, and sustainability while optimizing costs. Backed by a team of industry leading data scientists, engineers and AI experts, we serve over 50,000 customers across 160 countries, processing billions of data points hourly from more than 4 million vehicles. Data security and privacy are at the forefront of all we do—trusted by Fortune 500 organizations and some of the largest public sector fleets in the world, we meet top cybersecurity standards. Geotab’s open platform and diverse Geotab Marketplace offers hundreds of fleet-ready third-party solutions. Learn more at www.geotab.com and follow us on LinkedIn or visit Geotab News and Views.
© 2024 Geotab Inc.All Rights Reserved.
This white paper is intended to provide information and encourage discussion on topics of interest to the telematics community. Geotab is not providing technical, professional or legal advice through this white paper. While every effort has been made to ensure that the information in this white paper is timely and accurate, errors and omissions may occur, and the information presented here may become out-of-date with the passage of time.
Recent News
Mining value from connected vehicle data: How OEMs can lead the next-gen of data-enabled services
November 11, 2021
When and how to expand telematics coverage in off-road fleets
May 21, 2021
Preserving privacy and security in the connected vehicle: The OBD port on the road ahead
December 3, 2015
Sustainable fleet management: data-driven insights to enhance the bottom line
July 2, 2024