From Excel file to server park: data-driven research into charging infrastructure in the Netherlands
J.R. Helmus
@JRHelmus / 38 yr / Father / Fiat X1/9 / Innovator / Amersfoort / PhD researcher computational science / coordinator minor data science
The Netherlands is a lab for public e-mobility
Main problem to solve: optimization of charging infrastructure usage and rollout
Challenges for policy makers
From interviews with stakeholders within the EV ecosystem, their main problems appeared to be as follows:
1. Stakeholders need a set of performance metrics that allows them to balance supply and demand. They have limited insight into how typical behavior influences the performance of the system.
2. Stakeholders need insight into interventions that optimize the efficient use of existing resources.
3. Although they are professionals in the EV field, they have limited insight into the future state of the system given expected behavioral changes due to, e.g.:
   (i) new EV models with the latest technology (larger battery size)
   (ii) adoption by the early majority (new expectations)
   (iii) new user types (freight, taxis)
   (iv) a non-linear increase of user-user interactions
4. Municipalities in particular are searching for an effective deployment strategy that takes into account the non-trivial effects that complex systems display.
Source: J.R. Helmus, R. van den Hoed, and S. Maase. RAAK-PRO Projectvoorstel IDOLaad. Technical report, Amsterdam, 2014.
The dataset for this research is unique in the world: ~9.7M transactions from 120K users, ~7k charging stations

Parameter                  Example             Explanation
Charge point address       Admiralengracht 44  Address of the charge point
Charge point operator      Nuon                Owner of the charge point
Charging service provider  Essent              Owner of the used charging card
Charge point city          Amsterdam
Charge point postal code   1057EW              ZIP code of the area of the charge point
Volume                     0,86                Charged energy [kWh]
Connection time            0:14:23             Time the car was connected
Start Date                 18-04-2012          Date the session started
End Date                   18-04-2012          Date the session ended
Start Time                 23:20:55            Time the session started
End Time                   23:35:18            Time the session ended
Charging time              0:14:23             Time the car is actually charging
RFID                       60DF4D78            RFID code of a charging card

Participating geographic areas
- Amsterdam / Rotterdam / The Hague / Utrecht
- Metropool Regio Amsterdam (from Haarlem to Amersfoort)
- Evnet in the whole of the Netherlands
Deeper insight
- Specific charging cards for Taxi, Car2Go, Entrepreneurs, RDW
- Implemented interventions in G4 (parking) and EVNET (price)
- Contextual (open) data
Timing spectrum
- From real-time to monthly, from the first charging point in 2012 to 5 minutes ago
[Figure labels: RFID, charge point address, charge volume, connection time. From vd. Hoed, Helmus, Bardok (2014)]
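The example record uses Dutch notation: a decimal comma for the volume, dd-mm-yyyy dates, and h:mm:ss durations. The sketch below shows one way such a record could be converted into typed values before analysis. The dictionary keys (`start_date`, `volume`, and so on) are hypothetical names chosen for illustration; the source does not give the actual column names.

```python
from datetime import datetime, timedelta


def parse_duration(text):
    """Parse an 'h:mm:ss' duration such as '0:14:23'."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def parse_timestamp(date_text, time_text):
    """Combine a dd-mm-yyyy date and an hh:mm:ss time into one datetime."""
    return datetime.strptime(f"{date_text} {time_text}", "%d-%m-%Y %H:%M:%S")


def parse_record(raw):
    """Convert one raw record (dict of strings, hypothetical keys) to typed values."""
    return {
        "rfid": raw["rfid"],
        "address": raw["address"],
        # The decimal comma must be replaced before float() can parse the value.
        "volume_kwh": float(raw["volume"].replace(",", ".")),
        "start": parse_timestamp(raw["start_date"], raw["start_time"]),
        "end": parse_timestamp(raw["end_date"], raw["end_time"]),
        "connection_time": parse_duration(raw["connection_time"]),
        "charging_time": parse_duration(raw["charging_time"]),
    }


# The example values from the slide.
example = parse_record({
    "rfid": "60DF4D78",
    "address": "Admiralengracht 44",
    "volume": "0,86",
    "start_date": "18-04-2012",
    "start_time": "23:20:55",
    "end_date": "18-04-2012",
    "end_time": "23:35:18",
    "connection_time": "0:14:23",
    "charging_time": "0:14:23",
})
```

A useful side effect of typed values is that consistency checks become one-liners, for example comparing `end - start` with the reported connection time.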
It started with Charging Detail Records in Excel files
Out of 380,000 records, a subset of ~250,000 was suitable for analysis
[Bar chart: causes of ~35% data removal per error type. Bars: total before cleansing, connection time repair, physically impossible charge sessions, unknown data, double records, short time, double provider, net usable records. Vertical axis: number of records, 0 to 450,000.]
Data filtering and smart algorithms are required for quality of data analysis.
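Counting removals per error type, as the chart does, can be sketched as a single pass over the records. This is an illustration only: it covers three of the error types named on the slide, the record keys are hypothetical, timestamps are assumed to be in seconds, and the 22 kW power bound used to flag physically impossible sessions is an assumption rather than a figure from the source.

```python
from collections import Counter

MAX_POWER_KW = 22.0  # assumed upper bound on charging power, not from the source


def clean(records):
    """Return (usable_records, removal_counts) for a list of session dicts.

    Each record is assumed to have the keys rfid, address, start, end
    (timestamps in seconds) and volume_kwh.
    """
    removed = Counter()
    seen = set()
    usable = []
    for rec in records:
        key = (rec["rfid"], rec["address"], rec["start"], rec["end"])
        hours = (rec["end"] - rec["start"]) / 3600.0
        if rec["rfid"] is None or rec["address"] is None:
            removed["unknown data"] += 1
        elif key in seen:
            removed["double records"] += 1
        elif hours <= 0 or rec["volume_kwh"] > MAX_POWER_KW * hours:
            # More energy than the connection could deliver in the time available.
            removed["physically impossible"] += 1
        else:
            seen.add(key)
            usable.append(rec)
    return usable, removed
```

Keeping the counts per category makes it possible to report how much each error type contributes to the total removal.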
A complex algorithm for repairing short-time charge sessions required 16 hours of calculation on a laptop
Example of the short-time algorithm
[Diagram: one total charge session split into Session 1 to Session 9]
- The crawling algorithm checks on adjacent short times.
- The algorithm influences the number of charge sessions as well, and thus the mean session duration.
Source: Charge infrastructure forecast database
Note: Our hypothesis is that loose cable connections and information transfer issues cause this problem.
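The source does not spell out the repair algorithm, so the following is only a sketch of the general idea of walking over adjacent sessions and merging fragments that belong together. It assumes that fragments come from the same card at the same charge point and that a gap of at most `max_gap_s` seconds (a hypothetical parameter) marks them as one session.

```python
def merge_adjacent_sessions(sessions, max_gap_s=300):
    """Merge consecutive sessions of the same card at the same charge point
    when the gap between them is at most max_gap_s seconds.

    Each session is a dict with rfid, address, start, end (seconds)
    and volume_kwh. The input list is left untouched.
    """
    ordered = sorted(sessions, key=lambda s: (s["rfid"], s["address"], s["start"]))
    merged = []
    for session in ordered:
        last = merged[-1] if merged else None
        if (last is not None
                and last["rfid"] == session["rfid"]
                and last["address"] == session["address"]
                and session["start"] - last["end"] <= max_gap_s):
            # Extend the previous session instead of starting a new one.
            last["end"] = max(last["end"], session["end"])
            last["volume_kwh"] += session["volume_kwh"]
        else:
            merged.append(dict(session))
    return merged
```

Because merging reduces the number of sessions, summary statistics such as the mean session duration change as well, which is the effect noted on the slide.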
From Access to our first SQL Server environment for storing data centrally, yet this did not solve some fundamental problems
- Access security: SQL Server allows data to be protected with user passwords
- Server-side cleaning: allows long and continuous cleaning processes
- Homogeneous data: allows data to be provided in the same format to all users
- Server-side reporting: allows server-side Reporting Services to be set up
- User-side analysis: analysis was performed at the user side (MS Access), so (1) computational power was limited, (2) memory was limited, (3) there were security issues
[Diagram labels: "1 user, 1 MB data, 1 database" (SQL Server, MS Access); "10 users, 10 GB data, 1 database" (SQL Server, application server, MS Access)]
We needed a data warehouse (DWH) structure to share and store data
- Performance improvements: DWH structures are known to be faster
- SQL clarity: a simple DWH is easier to understand than complex ERDs
- Information uniformity: part of the DWH process is to reach a uniform understanding of the same data
- DWH expertise and maintenance: you need DWH experts for setup; translation to a DWH is not an easy task; new monthly data needs to be transformed to the DWH
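To illustrate why a simple warehouse layout can be easier to query than a complex ERD, the sketch below builds a tiny star schema with one fact table and two dimension tables in an in-memory SQLite database. The table and column names are hypothetical, and SQLite stands in for the SQL Server environment mentioned on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_charge_point (
    cp_id    INTEGER PRIMARY KEY,
    address  TEXT,
    city     TEXT,
    operator TEXT
);
CREATE TABLE dim_card (
    card_id  INTEGER PRIMARY KEY,
    rfid     TEXT,
    provider TEXT
);
CREATE TABLE fact_session (
    session_id INTEGER PRIMARY KEY,
    cp_id      INTEGER REFERENCES dim_charge_point(cp_id),
    card_id    INTEGER REFERENCES dim_card(card_id),
    start_ts   TEXT,
    end_ts     TEXT,
    volume_kwh REAL
);
""")

# One example session, using the values from the dataset slide.
conn.execute(
    "INSERT INTO dim_charge_point VALUES (1, 'Admiralengracht 44', 'Amsterdam', 'Nuon')"
)
conn.execute("INSERT INTO dim_card VALUES (1, '60DF4D78', 'Essent')")
conn.execute(
    "INSERT INTO fact_session VALUES "
    "(1, 1, 1, '2012-04-18 23:20:55', '2012-04-18 23:35:18', 0.86)"
)


def kwh_per_city(connection):
    """Total charged energy per city: one join from the fact table to a dimension."""
    rows = connection.execute("""
        SELECT cp.city, SUM(f.volume_kwh)
        FROM fact_session AS f
        JOIN dim_charge_point AS cp ON cp.cp_id = f.cp_id
        GROUP BY cp.city
        ORDER BY cp.city
    """).fetchall()
    return dict(rows)
```

Every analytical question then follows the same pattern: start at the fact table and join to the dimension that holds the grouping attribute.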
The IDOLAAD subsidy program (4 years) supports this research and requires deliverables as well
Program goal: develop insights that enable stakeholders within the EV value chain to achieve an effective rollout and efficient use of charging infrastructure
Working packages
Research team and participants
Research team
- 1 lector
- 3 postdoc senior researchers
- 2 PhD students
- 3 junior researchers (applied mathematicians)
- 3 IT specialists
- Computational science lab
Consortium participants (deliver data, cases, and work)
- 4 largest cities and surrounding areas
- 3 charging infrastructure providers (delivering data)
- 1 charging point producer
- 1 network grid provider
- 1 consultancy firm (specialized in EV interventions)
Expectations
- Prediction model for local demand (not this PhD study)
- Simulation model for optimizing charging infrastructure usage and rollout (as always, ASAP)
From IDOLAAD proposal, Hoed, Maase, Helmus (2014)
The research project supports policy makers
- Monitoring performance
- Predictive analytics
- Modelling & simulation
The dataset for this research is unique in the world: ~10.7M transactions from 140K users, ~7k charging stations
With the charging data as the central dataset, the database is continuously expanded, extended, enriched, and scraped
- Data expansion
- Data extension
- Data enrichment
- Data scraping
- OCPI
General methodology of IDOLAAD research
The data in IDOLAAD is exceptional
In the current state of the literature there is no such dataset for scientific use on public charging infrastructure in dense metropolitan areas.
Other datasets shown for comparison:
- 4933 charging sessions in a test environment
- 10 months of data, 10 EV drivers
- No EV data available; GPS locations of non-EV users
- 36 mln km / 28,000 vehicles / 0 EVs
From data science challenges to required IT infrastructure
Typical properties of data science tasks
- Data redundancy: typically you start with lots of data, and directly thereafter the garbage is filtered out
- Long calculation times on a single CPU: most data scientists use 1 core per analysis, except for specific machine learning algorithms that are able to use multiple cores
- In-RAM calculations: calculations are typically performed in RAM and not on hard drives, so RAM should not be a bottleneck
- Long idle times between analyses: users tend to have long idle times between two analysis runs; this time is typically used for looking at the results
Requirements for the ideal IT infrastructure
- Separation of functions: we needed a separation between storage, database, analysis server, and deployment of results
- No data on personal computers: all calculations must be made server side, without the ability to download data
- Computational requirements may not be a bottleneck for research: memory, storage, and CPUs may not put research in a queue
- Threefold user-specific security: users should log on to access the network, to access the server, and to access the data
- Shared resources: sharing computational resources (RAM/HD/CPU) is preferred over individual resources
- Invisible / unreachable for the outside world: VPN and professional firewalls are needed to allow access to servers
- Scalable, future-proof IT infrastructure: invest and implement now for several years
IT infrastructure for data science: security & computational resources may not be a bottleneck during the minor
[Diagram label: scalable computational server]
This shows the typical use of our servers
- CPU usage: typically short peaks when running algorithms. This means a few CPUs can host many users.
- Memory usage: peaks in memory point to unfiltered data and running algorithms.
So we reshuffled our organization to align it with data-intensive scientific research
Shiny server
SEVA is a data-driven, charging-behavior-focused, modular simulation model for EV user activities
Data driven: SEVA is fully data driven and data validated
- 4.5M charging transactions from 2013-2017
- Training set 2014-2016
- Tested and validated on 2017
Charging behavior dimensions
- WHEN: start/stop connection
- WHERE: clusters of activity patterns
- WHAT: connection times and kWh
- LONG: longitudinal patterns
Modular: different switchable modules
- Car sharing
- EV uptake
- Policies
Charging behavior of agents is modelled by activity patterns, geospatial clusters of activities, and discrete choice modelling
- Activity patterns: distributions of activities are used to simulate the start and stop connection events and the time between them
- Geospatial clustering of destinations: destinations are estimated by the weighted mean lon/lat over all time-related sessions
- Discrete choice modelling: a discrete choice model per user, based on environmental properties such as distance, costs (parking + kWh), and speed
[Diagram labels: option, option, center, option]
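Two of these building blocks lend themselves to a short illustration. The sketch below estimates a destination as the weighted mean of session coordinates and then turns the properties of candidate charge points into choice probabilities. The slides do not state which weights or which choice model SEVA uses, so the `weight` field, the multinomial logit form, and the coefficients `beta_distance` and `beta_cost` are all assumptions made for this example.

```python
import math


def weighted_centroid(sessions):
    """Weighted mean (lon, lat) over a user's sessions.

    Each session is a dict with lon, lat and weight. What the weight
    represents (for example hours connected) is an assumption.
    """
    total = sum(s["weight"] for s in sessions)
    lon = sum(s["lon"] * s["weight"] for s in sessions) / total
    lat = sum(s["lat"] * s["weight"] for s in sessions) / total
    return lon, lat


def choice_probabilities(options, beta_distance=-1.0, beta_cost=-0.5):
    """Multinomial logit probabilities over candidate charge points.

    Each option is a dict with distance_km and cost_eur. The negative
    coefficients make farther and more expensive options less likely.
    """
    utilities = [
        beta_distance * o["distance_km"] + beta_cost * o["cost_eur"]
        for o in options
    ]
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    norm = sum(weights)
    return [w / norm for w in weights]
```

In a simulation step, an agent would draw one option according to these probabilities, for instance with `random.choices(options, weights=probabilities)`.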
SEVA is validated on charging data
Validation and sensitivity analysis are extremely important for agent-based models. SEVA is based on the following approach:
- Agents are created from a training set, from which their activity patterns are derived
- After simulation, the simulated pattern of each user is compared to that user's existing pattern in the test set
- A validation value was developed to provide a single-value metric for each simulation
- For each parameter, a simulation was performed to check whether the validation metric improved
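The slides do not define the validation value, so the sketch below uses a stand-in: the overlap between the simulated and the observed distribution of session start hours per user, averaged over users into a single number between 0 and 1. Both the choice of start hours as the compared pattern and the overlap measure itself are assumptions for illustration, not the metric used in SEVA.

```python
from collections import Counter


def hourly_distribution(start_hours):
    """Share of sessions starting in each hour of the day (0 to 23).

    start_hours must be a non-empty list of integers.
    """
    counts = Counter(start_hours)
    total = len(start_hours)
    return [counts.get(hour, 0) / total for hour in range(24)]


def pattern_overlap(simulated_hours, observed_hours):
    """Histogram overlap: 1.0 for identical distributions, 0.0 for disjoint ones."""
    simulated = hourly_distribution(simulated_hours)
    observed = hourly_distribution(observed_hours)
    return sum(min(s, o) for s, o in zip(simulated, observed))


def validation_score(simulated_by_user, observed_by_user):
    """Mean overlap over all users present in both the simulation and the test set."""
    users = [user for user in observed_by_user if user in simulated_by_user]
    scores = [
        pattern_overlap(simulated_by_user[user], observed_by_user[user])
        for user in users
    ]
    return sum(scores) / len(scores)
```

With a single number per simulation run, a parameter sweep reduces to comparing scores between runs.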
SEVA was used to simulate future challenges
1. Car sharing programs: effects of car sharing on the charging infrastructure network
2. PHEV to large BEV transition: future EV charging, studied by looking into the differences in charging behavior due to battery size and car type
3. Future-proof rollout strategies: tested by their effect on failed sessions
Challenge: car sharing in cities
Car sharing programs may have unexpected effects on EV user convenience
Car sharing agent properties: multiple sessions per car per day, randomly distributed over time and space
Base case: current user population
Experiments: increase the number of non-habitual users per EV user
Analysis:
(i) effect of failed connection attempts
(ii) charging infrastructure vulnerability*
*Paper at TRA 2018, "Vulnerability Of Charging Infrastructure, A Novel Approach For Improving Charging Station Deployment", Glombek / Helmus
Challenge: transition from PHEV to FEV
Now that early lease contracts are ending and large-battery FEVs are entering the market, the question is what future charging will look like.
Base case: total population consisting of PHEV and FEV users.
Experiments: change (part of) the total population to low-battery FEV and/or high-battery FEV.
Analysis: influence on the charging infrastructure in terms of (per time unit per pole):
(i) hours connected
(ii) kWh charged
(iii) unique users
PHEV transition: simulation results
The graphs show the effects of the probability of users switching from PHEV to FEV on the KPIs: fewer sessions will occur at a charging point when more users switch to FEV, and the mean number of different users per day decreases because users will charge less frequently. Connection duration will decrease while total kWh per session increases. This leads to the conclusion that (i) the total efficiency of the charging infrastructure will increase due to the transition from PHEV to FEV and (ii) there will be less room for smart charging.
[Graph panels: number of weekly sessions; mean connection duration in days; mean total kWh per session per week]
Challenge: future-proof rollout strategies
The challenge is to have a rollout strategy that generates the most user convenience for the least cost and the highest number of users. User convenience is measured in the number of failed connection attempts.
Base case: current user population
Experiments: extension of current sockets by
(i) random selection of charging locations
(ii) increase at CPs with a high number of unique users
(iii) increase at high-performing CPs (in kWh)
(iv) increase at vulnerable CPs
Analysis:
(i) effect of failed connection attempts
(ii) charging infrastructure vulnerability*
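Counting failed connection attempts, the convenience measure named here, can be sketched as follows. This is a simplified illustration: an attempt fails when every socket at the chosen station is occupied at its start time, and a failed agent does not look for another station, which the actual model may handle differently. The tuple layout and the `sockets` mapping are hypothetical.

```python
def count_failed_attempts(attempts, sockets):
    """Count connection attempts that find all sockets occupied.

    attempts: list of (station_id, start, end) tuples with numeric times.
    sockets:  dict mapping station_id to its number of sockets.
    """
    occupied = {station: [] for station in sockets}  # end times of active sessions
    failed = 0
    for station, start, end in sorted(attempts, key=lambda a: a[1]):
        # Drop sessions that have already ended by the time of this attempt.
        active = [e for e in occupied[station] if e > start]
        if len(active) < sockets[station]:
            active.append(end)
        else:
            failed += 1
        occupied[station] = active
    return failed
```

Comparing rollout strategies then amounts to rerunning the count with a different `sockets` mapping per strategy, for example with sockets added at the stations each strategy selects.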
Conclusion and future work on SEVA
Conclusion
- SEVA is a data-driven, validated agent-based model that makes it possible to work on challenges for policy makers
- SEVA's purpose can be adapted to specific challenges by adding new modules
- SEVA's modularity and focus make it a lightweight, interactive model
Future work
- We are working on a fast-charging behavior module for inner-city placement of fast chargers
- We are working on an EV uptake model for local uptake of EV users
- We are looking for cases to be simulated for policy makers
Pillar 3: microscopic traffic model
750,000 System Billing Units (core hours) spent at LISA
ANY QUESTIONS