background
In ADS, the development of data fusion, sensor performance, L3+ perception and localization algorithms relies heavily on physical data collection, commonly in rosbag format, with data from GPS, RTK, camera, Lidar, radar, etc.
Building up the development process systematically is critical, but it is ignored by most ADS teams. In large OEMs, each group (e.g. the data fusion team, the sensor team) may have its own test vehicles, but few of them handle data systematically or build a solution to manage it. One reason is that the engineers lack the ability to give feedback and requirements for software tools, so they still get by with folder or Excel management, which is neither acceptable nor scalable for a large product team.
Thanks to the ROS open source community for contributing a great rosbag database management tool: bag_database. With the Docker installation, this tool is really easy to configure. A few tips:
- the web server port inside Docker can be made accessible from the LAN with the -p parameter of docker run.
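As a minimal sketch of the -p tip, the command below publishes the container's Tomcat port 8080 on host port 8088. The image name swrirobotics/bag-database, the host port, the host folder /path/to/rosbags and the container path /bags are assumptions for illustration, not taken from this post; check the bag_database documentation for the exact values and for further settings.

```shell
# Assumptions: image name, host port 8088, host folder and the /bags
# container path are placeholders; adjust them to your setup.
docker run -d \
  --name bagdb \
  -p 8088:8080 \
  -v /path/to/rosbags:/bags \
  swrirobotics/bag-database:latest

# -p HOST_PORT:CONTAINER_PORT publishes the port on all host interfaces,
# so another machine on the LAN can browse to http://<host-ip>:8088
```

Mapping a different host port with -p is also a way to avoid a clash on 8080 without touching the container's own configuration.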
mount smb drive
As mentioned, most of the rosbags already collected are stored on a shared drive, so one way to make them available is to mount that drive.
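A possible way to mount such a share on the Docker host is sketched below. The share //fileserver/rosbags, the mount point /mnt/rosbags and the user name are hypothetical placeholders, and the sketch assumes the cifs-utils package is installed.

```shell
# Assumptions: share name, mount point and user are placeholders;
# cifs-utils must be installed on the host.
sudo mkdir -p /mnt/rosbags

# Mount the SMB/CIFS share; mount prompts for the password.
sudo mount -t cifs //fileserver/rosbags /mnt/rosbags -o username=YOUR_USER

# Confirm that the bags are visible before pointing bag_database at the folder.
ls /mnt/rosbags | head
```

The mounted folder can then be handed to the container as a volume, so the bags do not have to be copied.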
Tomcat server configuration
bag_database is hosted by Tomcat, and the default port is 8080. Our server already hosts pgAdmin4 for the map group, GitLab for team work, an XML server for system engineering, and webviz, so first check whether port 8080 is already occupied.
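One way to run this check without extra tools is sketched below. The helper name port_in_use and the candidate port 8088 are hypothetical, and the sketch relies on bash's /dev/tcp pseudo-device, so it only detects listeners reachable on the loopback address.

```shell
# Hypothetical helper: exit status 0 if something accepts TCP connections
# on the given local port, non-zero otherwise (requires bash).
port_in_use() {
  # The subshell tries to open a TCP connection; the descriptor is
  # closed again as soon as the subshell exits.
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

for port in 8080 8088; do
  if port_in_use "$port"; then
    echo "port $port is occupied"
  else
    echo "port $port looks free"
  fi
done
```

With iproute2 installed, `ss -ltn` lists all listening TCP sockets and gives the same information, including services bound to a specific non-loopback address.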
So change the HTTP connector port in /usr/local/tomcat/conf/server.xml.
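A minimal sketch of that change follows. It works on a scratch copy with a trimmed-down, illustrative server.xml so it is safe to run anywhere; the new port 8088 is an assumption, and on the real server SERVER_XML would point to the Tomcat file instead.

```shell
# Assumption: 8088 is free on the host; pick any unused port.
NEW_PORT=8088

# Scratch copy with an illustrative Connector entry (not the full Tomcat file).
SERVER_XML="${TMPDIR:-/tmp}/server.example.xml"
cat > "$SERVER_XML" <<'EOF'
<Server port="8005" shutdown="SHUTDOWN">
  <Service name="Catalina">
    <Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="20000" redirectPort="8443" />
  </Service>
</Server>
EOF

# Rewrite only the HTTP connector's port attribute; a .bak backup is kept.
sed -i.bak "s/<Connector port=\"8080\"/<Connector port=\"${NEW_PORT}\"/" "$SERVER_XML"

# Show the edited line for a quick visual check.
grep '<Connector' "$SERVER_XML"
```

Tomcat has to be restarted for the new port to take effect.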
a few other tools
ros_hadoop
ros_hadoop is a rosbag analysis tool built on HDFS data management, which is a more scalable platform for massive ADS data requirements. If a large volume of rosbags needs to be processed, ros_hadoop should be a great tool. There is a discussion on the ROS wiki.
install hadoop
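concepts in hdfs
- namenode: daemon process that manages the file system metadata
- datanode: stores the data blocks and serves read and write requests for them
- secondary namenode: periodically checkpoints the namenode metadata by merging the edit log into the file system image; it is a helper, not a hot-standby backup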
hdfs-site.xml
/usr/local/hadoop/etc/hadoop/hdfs-site.xml
- dfs.datanode.data.dir -> local file system path(s) where each DataNode stores its data blocks
- dfs.replication -> number of replicas kept of each data block, to protect against data loss
- dfs.namenode.https-address -> address and port of the NameNode HTTPS server
- dfs.https.port ->
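To make the properties above concrete, the sketch below writes an example hdfs-site.xml to a scratch file for review. The data directory /data/hdfs/datanode and the replication value 1 (a single-node test setup; the Hadoop default is 3) are placeholder assumptions, and nothing is installed into /usr/local/hadoop/etc/hadoop automatically.

```shell
# Write an illustrative hdfs-site.xml to a scratch location for review.
EXAMPLE_XML="${TMPDIR:-/tmp}/hdfs-site.example.xml"
cat > "$EXAMPLE_XML" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- placeholder path: local directory where this DataNode keeps its blocks -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/hdfs/datanode</value>
  </property>
  <!-- placeholder value: one replica is only suitable for a single-node test -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

# Print the file so it can be compared with the existing configuration.
cat "$EXAMPLE_XML"
```

Each setting is a property element with a name and a value; properties that are not listed keep their defaults.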
copy local data into hdfs
hdfs dfs -put /your/local/file/or/folder [hdfs default data dir]
hdfs dfs -ls
mongodb_store
mongodb_store is a tool to store and analyze data from ROS systems. It is also listed on the ROS wiki.
mongo_ros
mongo_ros is used to store ROS messages in MongoDB, with C++ and Python clients.
mongo-hadoop
mongo-hadoop allows MongoDB to be used as an input source or output destination for Hadoop tasks.
ros_pandas
tabbles
tabbles can be used to tag any rosbags or folders.
hdfs_fdw
at the end
I talked with a friend from the DiDi software team: most big Internet companies have their own in-house software tool teams, which, as far as I know, do not exist in any traditional OEM. Is there a need for a tool team in OEMs? The common view is that at an early stage there is no need to develop and maintain in-house tools, since commercial ones are more efficient; as the department grows and needs more user-specific development that commercial tools no longer cover, tool teams may emerge. Still, this is mostly decided by the industry culture: an OEM's needs are often pre-defined by big suppliers, so before OEMs are clear about their own software tool requirements, the suppliers already have solutions in place. This is especially true for Chinese OEMs, which are maybe 50 years behind European suppliers.
I am interested in the business model of autovia.ai, which focuses on distributed machine learning and sensor data analytics in the cloud with petabytes of data, with the following skills:
- large-scale sensor data (rosbag) processing with Apache Spark
- large-scale sensor data (rosbag) training with TensorFlow
- parallel processing with fast serialization between nodes and clusters
- HD map generation tool in the cloud
- metrics and visualization on the web
- loading data directly from HDFS and Amazon S3
All these functions will be necessary for a full-stack ADS team to develop safety products in the future, which I call "data infrastructure for ADS".
refer
MooreMike: ros analysis in Jupyter