Kernel.org Google Summer of Code 2010 Ideas Page
This list is not exhaustive, and we welcome new suggestions. Some of these ideas are not in themselves complete projects; feel free to ask us how much work is likely to be involved, and how many ideas you might sensibly attempt as a Summer of Code project. The approximate difficulty level of each idea has been marked with one or more + symbols; the more +, the more difficult the project is expected to be.
NOTE for Linux Kernel related development
While kernel.org supports the Linux Kernel, we are currently focusing on infrastructure-related projects, as can be seen below. If you are looking to work on a Linux Kernel related project, I would suggest the following projects:
Should a Linux Kernel related project become available at kernel.org we will update the ideas page here, and change this notice. However, currently, kernel.org is not looking to take on any Linux Kernel development for GSoC 2010.
++ Centralized statistics gathering
This is a multi-part project involving both the collection of download statistics on the mirrors and their aggregation on a central server. The main idea is to create a universally usable system for collecting download statistics. The Open Source community tends to rely on a far-flung array of servers and infrastructure to distribute its downloads. This works wonderfully for the most part; however, the originator of the data has little insight into the mirrors themselves. This lack of insight stems from a multitude of problems, from privacy concerns and legal reasons to constrained system resources on the mirrors themselves.
This project is intended to help both the mirrors and the upstream providers of data get a better handle on how many downloads of various things are actually occurring. It is intended to be an all-encompassing solution, meaning that it should work equally well for kernel.org, Fedora, Ubuntu, Apache, and Mozilla, should they choose to use it. The project involves a frontend log parser capable of determining which downloads have occurred, the type of each download, how much data was transferred, and the number of unique downloaders for that server. There will also be a backend portion, initially hosted on kernel.org, which will be the collection point for the statistics provided by the frontend processes running on the mirrors. The backend will involve logging statistics, filtering out duplicates from a single mirror, verifying mirror authenticity, and aggregating the statistics. It will also provide a website where individuals can quickly browse and discover common downloads from a particular distribution or open source project.
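As a rough illustration of the frontend side, the following minimal sketch parses web server logs and produces per-file download counts, bytes transferred, and unique downloaders. It assumes the Apache/nginx "combined" log format; the function name `summarize`, the salt value, and the choice to hash client addresses with SHA-256 are all hypothetical and are not part of any existing kernel.org tool. Other log types (ftp, rsync, git) would need their own patterns.

```python
import hashlib
import re
from collections import defaultdict

# Assumed input: Apache/nginx "combined" access log lines.
LOG_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def summarize(lines, salt="example-salt"):
    """Aggregate successful GET requests per path (hypothetical helper)."""
    stats = defaultdict(lambda: {"downloads": 0, "bytes": 0, "clients": set()})
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        # Simplification: full (200) and partial (206) responses both count
        # as one download; a real client would need to merge range requests.
        if match["method"] != "GET" or match["status"] not in ("200", "206"):
            continue
        size = 0 if match["size"] == "-" else int(match["size"])
        # Hash the client address so raw IPs never leave the mirror.
        digest = hashlib.sha256((salt + match["client"]).encode()).hexdigest()
        entry = stats[match["path"]]
        entry["downloads"] += 1
        entry["bytes"] += size
        entry["clients"].add(digest)
    return {
        path: {
            "downloads": entry["downloads"],
            "bytes": entry["bytes"],
            "unique_clients": len(entry["clients"]),
        }
        for path, entry in stats.items()
    }
```

Because `summarize` accepts any iterable of lines, it can be fed an open file object and will process a large log one line at a time rather than loading it into memory, which matters given the log sizes involved.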
Things of Note about this project:
- There is both a client and a server aspect to this project; both pieces need to be created and made interoperable, along with a client/server API.
- Resource constrained environment
- Needs to be lightweight and as efficient as possible
- Potential to be processing tens or hundreds of gigabytes of data in a single run for a single machine
- Will be collecting data from a variety of different log types from http, ftp, rsync, git, etc.
- Mostly a web-app, for reporting and data collection
- Needs to be relatively efficient, but not to the same extent as the client
- Has to be capable of running independent of the kernel.org infrastructure
- General todos:
- Prototype client
- Prototype server
- Prototype API
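One possible shape for the client/server API and the backend's duplicate handling is sketched below. Everything here is an assumption made for illustration: the JSON report layout (`mirror`, `period`, `downloads`), the use of a per-mirror shared secret with HMAC-SHA256 for mirror authenticity, and the names `sign_report` and `Collector`. A report resubmitted for the same mirror and period replaces the earlier one, so it is never counted twice.

```python
import hashlib
import hmac
import json
from collections import Counter


def sign_report(report, key):
    """Client side: serialize a report and sign it with the mirror's key."""
    body = json.dumps(report, sort_keys=True).encode()
    signature = hmac.new(key, body, hashlib.sha256).hexdigest()
    return body, signature


class Collector:
    """Server side: verify, de-duplicate, and aggregate mirror reports."""

    def __init__(self, mirror_keys):
        self.mirror_keys = mirror_keys  # mirror id -> shared secret (bytes)
        self.reports = {}  # (mirror, period) -> {path: download count}

    def submit(self, body, signature):
        report = json.loads(body)
        key = self.mirror_keys.get(report.get("mirror"))
        if key is None:
            return False  # unknown mirror
        expected = hmac.new(key, body, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, signature):
            return False  # signature does not match the registered key
        # A repeated (mirror, period) overwrites instead of adding.
        self.reports[(report["mirror"], report["period"])] = report["downloads"]
        return True

    def totals(self):
        """Sum download counts per path across all accepted reports."""
        total = Counter()
        for downloads in self.reports.values():
            total.update(downloads)
        return dict(total)
```

Keeping the collector free of any kernel.org-specific state is deliberate in this sketch, since the server has to be able to run independently of the kernel.org infrastructure.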
++ chasmd improvements
- Primary: Robert Escriva
- Secondary: Ben Boeckel
- Assisting: John "Warthog9" Hawley
CHASM, the Cryptographic Hash Algorithm Secured Mirroring solution, is a project intended to alleviate many of the pains that mirrors have in organizing and verifying their content. It can be thought of as a stateful rsync daemon in some respects, and kernel.org and a number of other large mirroring infrastructures have been looking into it for several years now. It is ultimately expected to be used by a large portion of the bigger mirroring infrastructures, and as such it needs high performance and good design.
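To give a feel for the general idea of verifying mirror content with cryptographic hashes, here is a small sketch that builds a manifest of SHA-256 digests for a directory tree and compares two manifests. This is not CHASM's actual design or protocol, which this page does not describe; the function names `build_manifest` and `compare`, the manifest layout, and the choice of SHA-256 are assumptions made purely for illustration, and the sketch is in Python rather than the C++ that CHASM itself uses.

```python
import hashlib
import os


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(full, "rb") as handle:
                # Read in 1 MiB chunks so large files are never fully in memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full, root)] = digest.hexdigest()
    return manifest


def compare(upstream, mirror):
    """Return (missing, extra, changed) paths of a mirror versus upstream."""
    missing = sorted(set(upstream) - set(mirror))
    extra = sorted(set(mirror) - set(upstream))
    changed = sorted(
        path for path in upstream.keys() & mirror.keys()
        if upstream[path] != mirror[path]
    )
    return missing, extra, changed
```

A stateful daemon could keep such a manifest between runs and rehash only files that changed, rather than rescanning the whole tree each time as this sketch does.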
This project aims to help get CHASM to a usable, production-quality state. It is currently in the middle of a rewrite into C++ for performance reasons, and there are still several aspects that may need to be fleshed out. Individuals will need a solid understanding of *NIX systems programming in C or C++ (C++ is mainly used to provide things like destructors and type safety). Familiarity with the git SCM storage model and with rsync internals are both pluses.
Developers seeking to work on CHASM will be working primarily on developing network code, including documenting the network protocols. Students will be expected to be able to develop such code/protocols independently, but will be provided every chance for feedback and guidance from the current developers so as to maximize the impact of their contributions.
Students looking to work on CHASM should contact the current developers, and register on the bug tracker (http://projects.robescriva.com/account/register).
Things to note about this project:
- There are several servers involved in this project; most of which communicate locally over Unix domain sockets.
- Each server will be a separate piece of functionality.
- All code written should be accompanied by test code to aid in automated testing (see http://cdash.chasmd.org/ for our dashboard).
- C++ is the language used by current developers. We chose C++ for its beneficial standard library and ability to link C libraries as well.
- Code written must be capable of running for extended periods of time without excess resource consumption or leakage.
++ boot.kernel.org improvements
NOTE: This project will be working closely with the Etherboot project and should be considered as cross-listed there. All of the requirements of the Etherboot project will also apply to this project.
Boot.kernel.org is a universal network booting system that has taken off in the last year. It was a Google Summer of Code project in 2009, and it is a possibility for inclusion in 2010. Improvements include:
- More automatic creation of bootable live images
- Addition of more distributions in the live images
- Fedora (working cleanly)
- Inclusion of NFSv4 where appropriate for live images
- Clean-ups to the build process
- Working with Etherboot to get better scripting in gPXE for BKO use
- Working with projects like BFO (boot.fedoraproject.org)
- Creation of packages for distributions for installation into boot menus (example is already being created for Fedora / EPEL)
- Wireless support
There are surely more things to work on. This is a very inter-project, interdisciplinary project requiring knowledge of how systems boot, including a good understanding of how PXE normally works, as well as an understanding of makefile structures and C development. This project will work closely with the Etherboot project, and individuals who apply should understand that they will likely be mentored by individuals from both kernel.org and Etherboot. As such, individuals should be familiar with Etherboot's GSoC requirements and procedures as well as kernel.org's.
test.kernel.org provides an infrastructure for automatically building and testing each new version of the Linux kernel. Sparse provides a static analysis tool that finds problems in the Linux kernel and issues compile-time warnings about these problems. Find a way to make these two things work with each other.
- Get some automated build infrastructure such as test.kernel.org to run current versions of Sparse on each new kernel.
- Do something creative with the build logs. Recommendation: look into Al Viro's tools to remap line numbers in build logs based on diffs, and then diff the warnings in build logs between kernel versions to find out when warnings appear or disappear.
- Extra credit: add support for other static analysis tools, such as smatch.
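The idea of diffing warnings between kernel versions can be sketched as follows. This is a deliberately simplified stand-in for the line-number remapping approach mentioned above: rather than remapping line numbers based on diffs, it drops line and column numbers entirely and compares the remaining (file, kind, message) tuples. The function names and the assumed `path:line:col: warning: message` diagnostic layout are illustrative assumptions.

```python
import re

# Assumed diagnostic layout: path:line[:col]: warning|error: message
WARNING_PATTERN = re.compile(
    r'^(?P<file>[^:\s]+):(?P<line>\d+)(?::\d+)?: '
    r'(?P<kind>warning|error): (?P<msg>.*)$'
)


def extract_warnings(log_text):
    """Collect (file, kind, message) tuples, ignoring line/column numbers."""
    found = set()
    for line in log_text.splitlines():
        match = WARNING_PATTERN.match(line.strip())
        if match:
            found.add((match["file"], match["kind"], match["msg"]))
    return found


def diff_warnings(old_log, new_log):
    """Return (appeared, disappeared) diagnostics between two build logs."""
    old = extract_warnings(old_log)
    new = extract_warnings(new_log)
    return sorted(new - old), sorted(old - new)
```

Because the comparison is set-based, two identical messages in the same file collapse into one entry, and a warning that merely moves to a different line is treated as unchanged. A production version would want the more precise remapping approach.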