Open Source Licenses in Machine Learning and Self-driving Car Projects

Joshua Owoyemi
Nov 7, 2017

Since I finished the Udacity Self-driving engineer course, I have been interested in investigating some of the self-driving car open source projects that are available online. My aim is to see which one I could contribute to, or at some point to start something of my own. At this point, some important questions for me are: How much of an open source project is really open? How much of it can you use for your own projects? And what kinds of licenses back these open source projects? It is necessary (or maybe noble) to give appropriate attribution to the rightful owners of copied works. This could get complex along the way, but for now I am interested in knowing the licenses used in the open source projects that I might find useful.

In this post, I would like to share my summary of the various open source licenses and then compare a few popular open source projects. I will limit my scope to machine learning and self-driving cars because these are the areas that interest me right now. Hopefully, you will find this information useful.

Popular open source licenses

The following is a summary of the licenses used in open source software projects, according to choosealicense.com:

  • The MIT License is a permissive license that is short and to the point. It lets people do anything they want with your code as long as they provide attribution back to you and don’t hold you liable. The only conditions are preservation of copyright and license notices. Licensed works, modifications and larger works may be distributed under different terms and without source code.
  • The Apache License 2.0 is a permissive license similar to the MIT License, but also provides an express grant of patent rights from contributors to users. The conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications and larger works may be distributed under different terms and without source code.
  • The GNU GPLv3 is a copyleft license that requires anyone who distributes your code or a derivative work to make the source available under the same terms, and also provides an express grant of patent rights from contributors to users. Bash, GIMP, and Privacy Badger use the GNU GPLv3. It is conditioned on making available the complete source code of licensed works and modifications, including larger works that use a licensed work, under the same license. Copyright and license notices must be preserved.

There is also (according to opensource.org):

  • The 3-Clause BSD License (The New BSD License): This is part of the family of permissive free software (BSD) licenses, which impose minimal restrictions on the use and redistribution of covered software. This is in contrast to copyleft licenses, which have reciprocity (share-alike) requirements. There is also a simpler version, The 2-Clause BSD License. The primary difference is that the simpler version omits the non-endorsement clause; the FreeBSD variant of it also adds a further disclaimer about views and opinions expressed in the software.
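
The summaries above can also be written down as a small lookup table. The following is only a sketch in Python: the class, the field names, and the helper functions are made up for illustration, the values are transcribed from the summaries in this post, and the result is a simplification rather than legal advice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseSummary:
    """Hypothetical record; fields mirror the summaries in this post."""
    name: str
    copyleft: bool              # must derivatives be released under the same terms?
    express_patent_grant: bool  # do contributors explicitly grant patent rights?


# Values transcribed from the summaries above (a simplification).
LICENSES = {
    "MIT": LicenseSummary("MIT", copyleft=False, express_patent_grant=False),
    "Apache 2.0": LicenseSummary("Apache 2.0", copyleft=False, express_patent_grant=True),
    "GNU GPLv3": LicenseSummary("GNU GPLv3", copyleft=True, express_patent_grant=True),
    "3-Clause BSD": LicenseSummary("3-Clause BSD", copyleft=False, express_patent_grant=False),
}


def can_distribute_without_source(name: str) -> bool:
    """True if, per the summary, derivatives may ship without source code."""
    return not LICENSES[name].copyleft


def permissive_licenses() -> list[str]:
    """Names of the licenses summarized here as permissive, sorted."""
    return sorted(n for n, lic in LICENSES.items() if not lic.copyleft)


if __name__ == "__main__":
    for name in LICENSES:
        print(name, "-> closed-source distribution allowed:",
              can_distribute_without_source(name))
```

Even the permissive licenses in this table still require you to keep the copyright and license notices, so "allowed" here never means "no obligations".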

Popular frameworks and projects, and their respective licenses:

The following are currently the most popular machine learning frameworks, by the number of stars and forks on their GitHub repositories.

  • Google’s TensorFlow, 76,615 Stars, 37,798 Forks: An open source software library for numerical computation using data flow graphs. The graph nodes represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. TensorFlow also includes TensorBoard, a data visualization toolkit. It is licensed under The Apache License 2.0.
  • The Berkeley Vision and Learning Center (BVLC)’s Caffe, 21,160 Stars, 12,981 Forks: A deep learning framework made with expression, speed, and modularity in mind. It is developed by Berkeley AI Research (BAIR)/BVLC and community contributors. It is released under the BSD 2-Clause License and uses a shared copyright model in which each contributor holds copyright over their contributions. The project's version history records all such contribution and copyright details.
  • François Chollet’s Keras, 21,440 Stars, 7,828 Forks: A high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research. It is licensed under The MIT License.
  • Microsoft CNTK, 12,985 Stars, 3,379 Forks: A unified deep-learning toolkit that describes neural networks as a series of computational steps via a directed graph. In this directed graph, leaf nodes represent input values or network parameters, while other nodes represent matrix operations upon their inputs. CNTK allows users to easily realize and combine popular model types such as feed-forward DNNs, convolutional nets (CNNs), and recurrent networks (RNNs/LSTMs). It implements stochastic gradient descent (SGD, error backpropagation) learning with automatic differentiation and parallelization across multiple GPUs and servers. The license is not named directly, but its text looks very similar to The MIT License, with the copyright held by Microsoft Corporation.
  • Torch, 7,340 Stars, 2,192 Forks: The main package in Torch7 where data structures for multi-dimensional tensors and mathematical operations over these are defined. Additionally, it provides many utilities for accessing files, serializing objects of arbitrary types and other useful utilities. The license is not directly stated but looks very similar to the 3-Clause BSD License.

The last one here is not in exactly the same category as the rest, but it is very useful and popular for general machine learning applications.

  • scikit-learn, 22,816 Stars, 12,043 Forks: A simple and efficient tool for data mining and data analysis, accessible to everybody, and reusable in various contexts, built on NumPy, SciPy, and matplotlib. It is licensed under The New BSD License (The 3-Clause BSD).
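
The star and fork counts above are a snapshot and will drift over time. As a rough sketch, they could be refreshed with the public GitHub REST API, whose repository endpoint returns the fields `stargazers_count`, `forks_count`, and a `license` object with an `spdx_id`. The function names below are my own, and `fetch_repo` needs network access (unauthenticated requests are rate limited).

```python
import json
import urllib.request


def fetch_repo(owner: str, repo: str) -> dict:
    """Fetch repository metadata from the GitHub REST API (needs network)."""
    url = f"https://api.github.com/repos/{owner}/{repo}"
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def summarize_repo(payload: dict) -> str:
    """Format one repository the way the lists in this post do."""
    # "license" is null when GitHub finds no license file in the repository.
    license_info = payload.get("license") or {}
    spdx_id = license_info.get("spdx_id") or "unknown"
    return (
        f"{payload['name']}: {payload['stargazers_count']:,} Stars, "
        f"{payload['forks_count']:,} Forks, license {spdx_id}"
    )


if __name__ == "__main__":
    # Hand-written sample using the Keras figures quoted in this post,
    # so the example runs without a network connection.
    sample = {
        "name": "keras",
        "stargazers_count": 21440,
        "forks_count": 7828,
        "license": {"spdx_id": "MIT"},
    }
    print(summarize_repo(sample))
```

When GitHub cannot match a license file to a known license, it reports `NOASSERTION` as the `spdx_id`. In that case you still have to read the license file yourself, as I did for the projects above whose license is not directly stated.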

Self-Driving Cars Open Source Projects, and their respective licenses:

The following are currently the most popular open source self-driving car projects, by the number of stars and forks on their GitHub repositories. I have not investigated in depth how open these projects really are. They simply have “open source” in their titles or descriptions. Feel free to investigate on your own.

  • Commaai’s Openpilot, 6,572 Stars, 1,498 Forks: An open source driving agent. Currently it performs the functions of Adaptive Cruise Control (ACC) and Lane Keeping Assist System (LKAS) for Hondas and Acuras. The project claims to be on par with Tesla Autopilot at launch, and better than all other manufacturers. It is licensed under The MIT License.
  • Baidu’s Apollo, 6,395 Stars, 1,389 Forks: An open autonomous driving platform. It is a high performance flexible architecture which supports fully autonomous driving capabilities. Licensed under The Apache License 2.0.
  • Udacity self-driving-car, 3,323 Stars, 1,020 Forks: Includes the code and software base for the Self-Driving Car Nanodegree program, aimed at teaching the world how to build autonomous vehicles. The code is written for a 2016 Lincoln MKZ with 2 Velodyne VLP-16 LiDARs, 1 Delphi radar, 3 Point Grey Blackfly cameras, an Xsens IMU, an ECU, a power distribution system, and more. Datasets are released under MIT; everything else is GPLv3.
  • CPFL’s Autoware, 1,302 Stars, 625 Forks: Open-source software for urban autonomous driving, maintained by Tier IV. The following functions are supported: 3D Localization, 3D Mapping, Path Planning, Path Following, Accel/Brake/Steering Control, Data Logging, Car/Pedestrian/Object Detection, Traffic Signal Detection, Traffic Light Recognition, Lane Detection, Object Tracking, Sensor Calibration, Sensor Fusion, Cloud-oriented Maps, Connected Automation, Smartphone Navigation, Software Simulation and Virtual Reality. Licensed under the New BSD License (3-Clause BSD).
  • PolySync OSCC, 572 Stars, 150 Forks: Open Source Car Control (OSCC) is an assemblage of software and hardware designs that enable computer control of modern cars in order to facilitate the development of autonomous vehicle technology. It is a modular and stable way of using software to interface with a vehicle’s communications network and control systems. It supports only the 2014 or later Kia Soul, but the API and firmware have been designed to make it easy to add new vehicle support. The software parts are licensed under the MIT License.
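
To come back to the question from the beginning of the post (how much of this can you reuse in your own project?), the licenses listed above can be grouped with a few lines of Python. This is a sketch, not a legal tool: the dictionary only contains projects whose license this post states outright, the function names are my own, and for Udacity only the code license (GPLv3) is recorded.

```python
from collections import defaultdict

# Code licenses as reported in this post. Projects whose license is only
# described as "similar to" another license are deliberately left out.
PROJECT_LICENSES = {
    "TensorFlow": "Apache 2.0",
    "Keras": "MIT",
    "scikit-learn": "3-Clause BSD",
    "Openpilot": "MIT",
    "Apollo": "Apache 2.0",
    "Udacity self-driving-car": "GPLv3",  # datasets are MIT, code is GPLv3
    "Autoware": "3-Clause BSD",
    "OSCC": "MIT",
}

# Of the licenses summarized in this post, only the GPLv3 is copyleft.
COPYLEFT = {"GPLv3"}


def group_by_license(projects: dict) -> dict:
    """Map each license name to the sorted list of projects using it."""
    groups = defaultdict(list)
    for project, license_name in projects.items():
        groups[license_name].append(project)
    return {name: sorted(members) for name, members in groups.items()}


def copyleft_projects(projects: dict) -> list:
    """Projects whose code must stay under the same license when redistributed."""
    return sorted(p for p, lic in projects.items() if lic in COPYLEFT)


if __name__ == "__main__":
    for license_name, members in group_by_license(PROJECT_LICENSES).items():
        print(f"{license_name}: {', '.join(members)}")
    print("Copyleft:", ", ".join(copyleft_projects(PROJECT_LICENSES)))
```

Grouped this way, most of the projects in this post sit under permissive licenses, and the Udacity code is the only entry that would require a derivative work to be released under the same terms.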

I hope you find the information in this quick summary useful. Please give some applause if it is helpful to you, and follow me to see more posts in the future.

Cheers.

Originally published at blog.toluwa.tech on November 7, 2017.