Predicting Insurance Risk

Risk Brackets and Data

Insurance companies devote much of their effort to calculating the rate each customer should pay for the coverage they need. A key part of this work is calculating the customer’s risk factor. Because some people use their insurance plans more than others, insurers estimate each customer’s risk and adjust their rates accordingly. When two people both want plan “A,” the company looks at their driving history, accidents, and other factors to determine their “premiums”: how much extra each must pay, on top of the amount that covers the plan’s cost, to offset their risk.

OneClick.ai’s models make this difficult process significantly easier by removing the need for feature engineering. Rather than processing and tuning data so that it reflects the past, which often removes key features along the way, you can upload the data in its rawest form. OneClick.ai’s AI platform automatically determines the most important and relevant features and tunes its models accordingly. This greatly reduces the workload on the underwriting team and avoids errors that can creep in during feature engineering.

Predicting Risk with Time-Series Classification

Most insurance companies renew a customer’s insurance on a yearly basis, though not necessarily at the same time each year. In our example, we are a car insurance company that wants to accurately classify customers into “Risk Brackets” that tell us how much extra to charge them for their coverage based on how likely they are to use their insurance. It is currently 2017, and we want to know each customer’s risk bracket for 2018. To predict the risk bracket with time-series classification, we need to train a model on a time-series dataset that goes as far back as our records do. We leave 2017’s data out of training because that year is reserved for prediction once the model is trained. The data set used to train our model will look like this:



Our table contains basic customer data such as the number of times each customer used their insurance, their total claim amounts, their credit score, and their age. These values are aggregated into yearly totals, and the last column shows which risk bracket was optimal for that year. There are three customers, whose data was recorded up to the point where we stopped recording (2016 in this example). Notice that the customers’ insurance does not all start in the same month. Because there is no set day on which insurance is renewed, the date a person starts new coverage can vary from year to year.
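As a rough illustration of the yearly aggregation described above, the sketch below rolls individual claim records up into one row per customer per year. The record layout, field names, and values are hypothetical and are not taken from the original table; it assumes each raw record carries a customer ID, a year, and a claim amount.

```python
from collections import defaultdict

# Hypothetical raw claim records: (customer_id, year, claim_amount).
claims = [
    ("C1", 2015, 1200.0),
    ("C1", 2015, 300.0),
    ("C1", 2016, 450.0),
    ("C2", 2016, 2000.0),
]


def aggregate_by_year(records):
    """Return {(customer_id, year): {"claim_count": n, "total_claims": amount}}."""
    totals = defaultdict(lambda: {"claim_count": 0, "total_claims": 0.0})
    for customer_id, year, amount in records:
        row = totals[(customer_id, year)]
        row["claim_count"] += 1          # one more use of the insurance
        row["total_claims"] += amount    # running yearly claim total
    return dict(totals)


yearly = aggregate_by_year(claims)
```

Slower-moving attributes such as credit score and age would be attached to each yearly row rather than summed.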

The optimal brackets in the training data do not need to be the brackets the customers were actually assigned in a given year; they are the best-fit brackets the customers should have been in now that all their data is known. This way, the model is trained to predict the optimal risk bracket rather than a real-world one that may contain errors.
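The training window can be sketched as a simple split: rows from before 2017 that carry a retrospective optimal bracket go into training, and 2017 rows are held back for prediction. This is a minimal sketch under assumed field names (`year`, `optimal_bracket`) and made-up rows, not the platform’s own data handling.

```python
# Hypothetical yearly rows; "optimal_bracket" is None where it is not yet known.
rows = [
    {"customer_id": "C1", "year": 2015, "claim_count": 2, "optimal_bracket": 1},
    {"customer_id": "C1", "year": 2016, "claim_count": 1, "optimal_bracket": 0},
    {"customer_id": "C1", "year": 2017, "claim_count": 0, "optimal_bracket": None},
]


def split_rows(all_rows, holdout_year):
    """Separate labeled history from the held-out year we want to score."""
    train = [
        r for r in all_rows
        if r["year"] < holdout_year and r["optimal_bracket"] is not None
    ]
    to_predict = [r for r in all_rows if r["year"] == holdout_year]
    return train, to_predict


train, to_predict = split_rows(rows, holdout_year=2017)
```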

This data will be used to train a classification model that predicts which risk bracket each customer should be placed in for the next year. Each prediction is informed by the previous year’s bracket choices as well as the customer’s past data. Once the model is trained, it can be applied to our 2017 data to predict the risk brackets for 2018:


Notice that these are customers who will be renewing their insurance in June of 2017. Each customer’s 2017 data has been recorded, but the risk bracket has been left blank because it has not yet been decided; it is what we are trying to predict. The trained model will look at this data and assign each customer to a risk bracket based not only on the current row but also on all the historical data it has observed. Its predictions will be in the same format as the labels used to train it (numeric classes) and will come with confidence levels (given as percentages) that tell you how confident the model is in each choice.
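To make the output format concrete, the sketch below turns a set of per-bracket probabilities into a numeric class plus a confidence percentage. The probabilities are invented for illustration, and the function is a hypothetical helper; how the platform actually computes and reports confidence is not specified here.

```python
def to_prediction(class_probabilities):
    """Map {bracket: probability} to (predicted_bracket, confidence_percent)."""
    # Pick the bracket with the highest probability.
    bracket = max(class_probabilities, key=class_probabilities.get)
    # Express that probability as a percentage, rounded to one decimal place.
    confidence = round(100.0 * class_probabilities[bracket], 1)
    return bracket, confidence


# Hypothetical model scores for one customer across three risk brackets.
scores = {0: 0.15, 1: 0.70, 2: 0.15}
prediction = to_prediction(scores)
```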

New customers who were not part of the original training data need to be added to the historical data alongside all previous customers, and a new model needs to be trained before predicting the new customer’s class. This ensures that the model remains as accurate as possible.

After the model is trained, it can be given customers’ data for each month of the current year and predict the optimal bracket to place them in until their insurance comes up for renewal the following year. Our insurance company can now go month by month through 2017, assigning risk brackets to each month’s batch of customers and updating the model with new customers’ data each time.
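The month-by-month workflow amounts to grouping customers by renewal month and processing one batch at a time. The sketch below shows only the grouping step, with hypothetical customer records and field names; retraining and scoring are left as a placeholder comment because they depend on the modeling tool in use.

```python
from collections import defaultdict

# Hypothetical customers with the month (1-12) in which they renew.
customers = [
    {"customer_id": "C1", "renewal_month": 6},
    {"customer_id": "C2", "renewal_month": 6},
    {"customer_id": "C3", "renewal_month": 11},
]


def batches_by_renewal_month(customer_rows):
    """Group customer IDs by renewal month, ordered from January to December."""
    batches = defaultdict(list)
    for row in customer_rows:
        batches[row["renewal_month"]].append(row["customer_id"])
    return dict(sorted(batches.items()))


monthly_batches = batches_by_renewal_month(customers)
for month, customer_ids in monthly_batches.items():
    # Placeholder: retrain on all history (including new customers),
    # then predict risk brackets for this month's batch.
    print(month, customer_ids)
```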
