Jai Shri Bhagwan (Glory to the Lord).
In the last post I talked about how to add a system call on an x86 or x86_64 system. There are several reasons a user space application might want to interact with the kernel: I/O operations, performance statistics, or a special device that has its own set of ioctl calls the program wants to use. We saw one way to do this, namely a system call by which a user space program can interact with the kernel; however, it is not always possible, or even necessary, to add new system calls.
Here we'll take a look at one of the simplest interfaces for interacting with the kernel, which is sysfs. It's not quite as simple as you may think, but we can leave out a lot of things like locking, memory allocation, file operations and so on, and focus on one thing: the easiest way to get data into and out of the kernel. Sysfs is usually used for interacting with modules and exporting device-specific information; however, it can be used for almost anything you want to accomplish. So let's start with the basics: what exactly is sysfs?
The basic idea of having a Sysfs
Sysfs was created mainly for devices and kernel modules wishing to export/import information. A single attribute can't export more than PAGE_SIZE (usually 4 KiB) of data; however, depending on how you implement the sysfs files, you can accomplish considerably more. The basic idea is that when a module wants interaction from user space, for example a device that the root user can turn off by writing a specific command to the device's register, the user shouldn't have to go as far as writing a program that issues ioctls. Instead, the device's driver module can create sysfs entries that allow the root user to just do echo <command_value> > /sys/<sysfs_file>, which takes care of everything.
The whole of sysfs is based on the idea of a kobject, which represents some kind of entity. A kobject may have a parent and may have many children, which are again kobjects. The basic idea is something like what is shown below,
[Figure: Sysfs Structure]
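
To make the parent/child idea a little more concrete, here is a minimal user-space sketch (this is not kernel code). The names toy_kobj and toy_kobj_path are hypothetical and invented for this illustration; a real kobject carries far more state. The sketch only shows how following parent pointers gives an entry its location under /sys.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct kobject: a name and a parent. */
struct toy_kobj {
    const char *name;
    const struct toy_kobj *parent;   /* NULL means "directly under /sys" */
};

/*
 * Build the sysfs-style path of a toy kobject by walking up its parents.
 * Returns the number of characters written (excluding the NUL), or -1 if
 * the buffer is too small.
 */
static int toy_kobj_path(const struct toy_kobj *k, char *buf, size_t len)
{
    int used, n;

    if (k == NULL) {
        /* No parent left: this is the root of the hierarchy. */
        n = snprintf(buf, len, "/sys");
        return (n < 0 || (size_t)n >= len) ? -1 : n;
    }

    /* First write the parent's path, then append our own name. */
    used = toy_kobj_path(k->parent, buf, len);
    if (used < 0)
        return -1;

    n = snprintf(buf + used, len - (size_t)used, "/%s", k->name);
    if (n < 0 || (size_t)n >= len - (size_t)used)
        return -1;

    return used + n;
}
```

With a toy kobject named pks_kobj whose parent is NULL, the helper produces /sys/pks_kobj, which is where the module later in this post places its directory.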
So the idea is to group kobjects of a certain type and put them under that type. By default you can see the sysfs entries in /sys, and depending on the type of kobject you wish to implement, it can be added under one of these. Since this is a very gentle introduction to kobjects, we won't use any parent and will add our kobject directly under /sys. So let's see what we need to know in order to do this.
The following is the listing of the kobject structure,
    struct kobject {
        const char          *name;
        struct list_head    entry;
        struct kobject      *parent;
        struct kset         *kset;
        struct kobj_type    *ktype;
        struct sysfs_dirent *sd;
        struct kref         kref;
        unsigned int state_initialized:1;
        unsigned int state_in_sysfs:1;
        unsigned int state_add_uevent_sent:1;
        unsigned int state_remove_uevent_sent:1;
        unsigned int uevent_suppress:1;
    };
The above structure seems daunting, and there's a lot going on in it; however, we don't need to bother about most of it right now. We just need the name, the kref and the parent. The rest is used for internal kobject maintenance. The leaf kobjects are where the real work happens. These leaf kobjects are implemented by the module writer using attributes or, more specifically, kobj_attribute. The following shows the listing for both,
    struct attribute {
        const char *name;
        umode_t    mode;
    #ifdef CONFIG_DEBUG_LOCK_ALLOC
        bool                  ignore_lockdep:1;
        struct lock_class_key *key;
        struct lock_class_key skey;
    #endif
    };

    struct kobj_attribute {
        struct attribute attr;
        ssize_t (*show)(struct kobject *kobj, struct kobj_attribute *attr,
                        char *buf);
        ssize_t (*store)(struct kobject *kobj, struct kobj_attribute *attr,
                         const char *buf, size_t count);
    };
As you can see, kobj_attribute embeds the attribute structure, but it also provides show and store methods for moving information between the kernel and user space. Thus it all boils down to the following steps,
- Create a parent for our attributes. This is required since we don't want to have our attributes coming up under sysfs directly.
- Create some kobj_attribute structures and set the store and show on these.
Kernel Module using Sysfs, Kobjects and kobj_attribute
The idea of this module is to have
- A parent directory, that is a parent Kobject.
- Two attributes that store their information in a static array.
    /*
     * The original listing included a local <common.h>. The standard headers
     * below are assumed to cover what it provided.
     */
    #include <linux/init.h>
    #include <linux/kernel.h>
    #include <linux/kobject.h>
    #include <linux/module.h>
    #include <linux/string.h>
    #include <linux/sysfs.h>

    #define ROOT_KOBJ_NAME   "pks_kobj"
    #define ROOT_ATTR1_NAME  "pks_kobj_attr1"
    #define ROOT_ATTR2_NAME  "pks_kobj_attr2"
    #define ROOT_ATTRS_COUNT 2

    static ssize_t rootfs_show(struct kobject *kobj, struct kobj_attribute *attr,
                               char *buf);
    static ssize_t rootfs_store(struct kobject *kobj, struct kobj_attribute *attr,
                                const char *buf, size_t count);

    /* 1 word of storage per attribute */
    #define ROOT_ATTR_STORAGE_SIZE (ROOT_ATTRS_COUNT * sizeof(unsigned long))

    /* This is, in effect, the directory for our sysfs files. */
    static struct kobject *root_kobj;

    /* These are the files we'll see under pks_kobj. */
    static struct kobj_attribute root_kobj_attr1 =
        __ATTR(root_kobj_attr1, S_IWUSR|S_IRUGO, rootfs_show, rootfs_store);
    static struct kobj_attribute root_kobj_attr2 =
        __ATTR(root_kobj_attr2, S_IWUSR|S_IRUGO, rootfs_show, rootfs_store);

    /* NULL-terminated, because sysfs_create_files() takes no length. */
    static const struct attribute *root_kobj_attr[] = {
        &root_kobj_attr1.attr,
        &root_kobj_attr2.attr,
        NULL
    };

    /*
     * We need storage to get/put data from/to user land. Let's just create
     * a static array for this.
     */
    static char attribute_storage[ROOT_ATTR_STORAGE_SIZE];

    /* attr1 uses the first word of the storage, attr2 the second. */
    static char *attr_slot(const struct kobj_attribute *attr)
    {
        if (attr == &root_kobj_attr2)
            return attribute_storage + sizeof(unsigned long);
        return attribute_storage;
    }

    static int __init init_sysfs_objs(struct kobject *root_kobj_parent)
    {
        int err = 0;

        root_kobj = kobject_create_and_add(ROOT_KOBJ_NAME, root_kobj_parent);
        if (!root_kobj) {
            err = -ENOMEM;
            goto no_root_kobj;
        }

        err = sysfs_create_files(root_kobj, root_kobj_attr);
        if (err)
            goto err_create_files;

        return 0;

    err_create_files:
        kobject_put(root_kobj);
    no_root_kobj:
        return err;
    }

    static int __init load_module(void)
    {
        return init_sysfs_objs(NULL);
    }

    static void __exit cleanup_sysfs_objs(void)
    {
        /* Files first, then the directory that holds them. */
        sysfs_remove_files(root_kobj, root_kobj_attr);
        kobject_put(root_kobj);
    }

    static void __exit unload_module(void)
    {
        cleanup_sysfs_objs();
    }

    static ssize_t rootfs_show(struct kobject *kobj, struct kobj_attribute *attr,
                               char *buf)
    {
        /* Raw (binary) copy of one word into the page-sized buffer. */
        memcpy(buf, attr_slot(attr), sizeof(unsigned long));
        return sizeof(unsigned long);
    }

    static ssize_t rootfs_store(struct kobject *kobj, struct kobj_attribute *attr,
                                const char *buf, size_t count)
    {
        unsigned long old_val, new_val;

        /* Reject short writes so we never read past what the user supplied. */
        if (count < sizeof(new_val))
            return -EINVAL;

        memcpy(&old_val, attr_slot(attr), sizeof(old_val));
        memcpy(&new_val, buf, sizeof(new_val));
        pr_debug("Changing from %lu to %lu\n", old_val, new_val);
        memcpy(attr_slot(attr), &new_val, sizeof(new_val));

        /* Report the whole write as consumed; only one word is stored. */
        return count;
    }

    module_init(load_module);
    module_exit(unload_module);

    /* kobject_create_and_add() and sysfs_create_files() are GPL-only exports. */
    MODULE_LICENSE("GPL");
Creating the root directory for our kobj_attributes
In the above listing, we've created one directory, represented by our root_kobj. Kobjects should be created dynamically rather than statically, so we've used the function kobject_create_and_add for this purpose. We've supplied NULL as the final argument of that function, which means this kobject has no parent and therefore appears directly under /sys.
Creating the kobj_attributes
The attributes you would like to show will almost always be declared statically, since you know what you want to show in sysfs for your device or for whatever purpose you are creating those entries. To facilitate this, the kernel provides the macro __ATTR for initializing a kobj_attribute. This macro takes the attribute's name as its first argument and stringifies it to get the file name, so we don't even need the names defined at the top; the files appear as root_kobj_attr1 and root_kobj_attr2.
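
The stringify step can be sketched in plain user-space C. TOY_ATTR, struct toy_attr and toy_attr_is_named are hypothetical names made up for this illustration and are much simpler than the kernel's own definitions; the only point is to show how the preprocessor's # operator turns the first macro argument into the file name.

```c
#include <string.h>

/* Hypothetical miniature of struct attribute: just a file name and a mode. */
struct toy_attr {
    const char *name;
    unsigned int mode;
};

/*
 * Made-up macro that mimics the stringify step of __ATTR: the # operator
 * turns the identifier passed as _name into a string literal.
 */
#define TOY_ATTR(_name, _mode) { .name = #_name, .mode = (_mode) }

/* The variable name and the resulting file name come out identical. */
static const struct toy_attr root_kobj_attr1 = TOY_ATTR(root_kobj_attr1, 0644);

/* Returns 1 if the attribute would show up under the given file name. */
static int toy_attr_is_named(const struct toy_attr *attr, const char *name)
{
    return strcmp(attr->name, name) == 0;
}
```

The mode 0644 used here is the numeric value of S_IWUSR|S_IRUGO from the module listing.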
Another important thing to note here is that for each attribute you'll have to specify a show and a store method. Most of the time you'll have some common code to execute, so there are two ways in which you can do this,
- Provide a common routine and check which attribute is passed in by comparing the pointer to your statically defined kobj_attribute
- Provide wrappers over the kobj_attribute and then do container_of to get the containing attribute structure and go forward that way. This requires a bit more work and you may not even want this.
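
The second approach can be sketched in user space as follows. This is only an illustration under stated assumptions: demo_attr, demo_wrapped_attr, demo_container_of, demo_show and demo_store are hypothetical names, and the macro is a simplified version of the container_of idea without the kernel's type checking.

```c
#include <stddef.h>

/* Hypothetical miniature stand-in for kobj_attribute. */
struct demo_attr {
    const char *name;
};

/* Wrapper: per-attribute storage lives right next to the embedded attribute. */
struct demo_wrapped_attr {
    unsigned long value;
    struct demo_attr attr;
};

/*
 * Simplified user-space take on container_of: step back from the address
 * of the embedded member to the start of the structure that contains it.
 */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* A common "show": no pointer comparisons, the wrapper carries the data. */
static unsigned long demo_show(struct demo_attr *attr)
{
    struct demo_wrapped_attr *w =
        demo_container_of(attr, struct demo_wrapped_attr, attr);

    return w->value;
}

/* A common "store" using the same trick. */
static void demo_store(struct demo_attr *attr, unsigned long value)
{
    struct demo_wrapped_attr *w =
        demo_container_of(attr, struct demo_wrapped_attr, attr);

    w->value = value;
}
```

The trade-off is visible even in this toy: the common routine no longer needs to know every attribute by name, at the cost of defining and initializing a wrapper structure.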
We've used an available wrapper function, sysfs_create_files. The first argument of this function is the kobject under which we will create these attributes, while the second is an array of pointers; note how we've specified NULL at the end of this array. This is mandatory, since the function iterates over the array until it finds a NULL entry, because no length is supplied.
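
The need for the terminating NULL is easy to see in a small user-space sketch. The names list_attr and toy_count_attrs are hypothetical; the loop only illustrates how a function that receives no length has to find the end of the array.

```c
#include <stddef.h>

/* Hypothetical miniature attribute: only the name matters here. */
struct list_attr {
    const char *name;
};

/*
 * Walk the array until the NULL sentinel is reached. Without that
 * sentinel the loop would run past the end of the array.
 */
static size_t toy_count_attrs(const struct list_attr *const *attrs)
{
    size_t n = 0;

    while (attrs[n] != NULL)
        n++;

    return n;
}
```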
Copying Data to/from user space
The store method implies that
- You are copying data from user land to the kernel.
- You return how much data you've consumed. Usually you just return the same count that was passed in, but copy only as much as you really want.
The show method implies that
- You are copying data from the kernel to user land.
- You return how much data you've copied into the buffer.
The internal buffer is just an array. It holds the value as an unsigned long for each of the attributes.
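
From user space, exchanging one raw word with such an attribute could look like the sketch below. The helper names attr_write_word and attr_read_word are hypothetical, and the sketch assumes the attribute exchanges exactly one unsigned long in binary form, as the module above does.

```c
#include <fcntl.h>
#include <unistd.h>

/* Write one raw unsigned long to an attribute file. 0 on success, -1 on error. */
static int attr_write_word(const char *path, unsigned long value)
{
    int fd = open(path, O_WRONLY);
    ssize_t n;

    if (fd < 0)
        return -1;

    n = write(fd, &value, sizeof(value));
    close(fd);

    return n == (ssize_t)sizeof(value) ? 0 : -1;
}

/* Read one raw unsigned long from an attribute file. 0 on success, -1 on error. */
static int attr_read_word(const char *path, unsigned long *value)
{
    int fd = open(path, O_RDONLY);
    ssize_t n;

    if (fd < 0)
        return -1;

    n = read(fd, value, sizeof(*value));
    close(fd);

    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

With the module loaded, the first attribute should be reachable as /sys/pks_kobj/root_kobj_attr1; writing to it requires root, since the mode grants write permission to the owner only.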
Cleaning up
You'll need to remove the files you created, in the same way you added them. Just be sure to do it in reverse order: first remove the files, then release the parent kobject. To release root_kobj, all you need to do is call kobject_put. This decrements the kobject's reference count, and when the count reaches 0 the kobject is cleaned up. This is why you remove the files first and only then release the parent kobject.
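
The get/put pattern behind kobject_put can be modelled in a few lines of user-space C. Everything here (toy_obj, toy_obj_create, toy_obj_get, toy_obj_put) is a hypothetical simplification; in particular the plain int counter is not thread-safe, unlike the kernel's kref.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object. */
struct toy_obj {
    int refcount;
    int *released;   /* set to 1 when the object is freed, for illustration */
};

/* Create an object holding one reference, like a freshly created kobject. */
static struct toy_obj *toy_obj_create(int *released_flag)
{
    struct toy_obj *o = malloc(sizeof(*o));

    if (o == NULL)
        return NULL;

    o->refcount = 1;
    o->released = released_flag;
    *released_flag = 0;
    return o;
}

/* Take an additional reference. */
static void toy_obj_get(struct toy_obj *o)
{
    o->refcount++;
}

/* Drop a reference; the object is freed only when the last one goes away. */
static void toy_obj_put(struct toy_obj *o)
{
    if (--o->refcount == 0) {
        *o->released = 1;
        free(o);
    }
}
```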
Exercises
- Modify the above module so that the first byte of each attribute's storage area represents 8 bit flags. That is the data can be stored in only 3 of the 4 bytes on a 32 bit computer and 7 of the 8 bytes on 64 bit computer.
- Write test programs, a producer and a consumer, that will write and read data respectively. Use the flag byte for any synchronization you may need. If the buffer is already full and the consumer hasn't consumed the data yet, you should check the flag byte to see whether the data can be overwritten or not. This flag will be set/unset randomly by your producer on each write. If the data can't be overwritten, then you should return an error code, or just 0, to convey that nothing was written.
- Try implementing a wrapper over attributes and see how you can use container_of to accomplish the same. Think about what you'll need in your wrapper structure.
We'll certainly visit sysfs again later on, when we dive into device drivers.