[Click] CPU scheduling -> make elements interruptable.

Roman Chertov rchertov at purdue.edu
Fri Jun 29 09:08:22 EDT 2007


Hello,

This sounds like a reasonable solution that shouldn't be too hard to 
integrate.  Take a look at the BalancedThreadSched element.  It is 
supposed to measure task durations and then schedule tasks based on its 
empirical observations.  I haven't looked at its source code myself, 
though; I'm going by the documented description.  However, you will 
find quite a few elements in the linuxmodule directory that carry a lot 
of accounting information from Eddie's thesis, although most of the 
accounting info that I have seen is per element.

Roman

Egi, Norbert wrote:
> Hi,
>  
> Roman Chertov wrote:
>> If the processing of the push/pull path is constant then you can treat
>> it as a "quantum".  Then you just need to maintain a FIFO scheduler
>> which schedules different clients on a per-quantum basis.  That might
>> work as well.
> 
> The push/pull path of each client can be different; we are even thinking
> of allowing the clients to create their own paths, so the paths'
> processing times can differ greatly from each other. By the way, in the
> initial prototype configuration I have a Classifier after the PollDevice
> that identifies which client each packet belongs to and pushes the
> packet out on the output designated to that client's path. After each
> output there is a Queue followed by an Unqueue element. These Unqueue
> elements should be scheduled relative to each other in order to achieve
> a relatively fair CPU share. With the use of ScheduleInfo this seems to
> work fine when the paths are identical. I realize that having an extra
> Queue and Unqueue element per client means that the packets are
> scheduled an extra time, which can mean some performance loss, but my
> forwarding performance tests haven't shown a significant difference.
> There is some loss, but it seems quite negligible with my
> configurations.
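> 
> To make this concrete, here is a stripped-down sketch of the kind of
> configuration I mean, for two clients (the Classifier patterns, device
> names and ticket values are just placeholders, and the real per-client
> elements would sit after each Unqueue):
> 
>    pd  :: PollDevice(eth0);
>    cl  :: Classifier(30/0a000001, 30/0a000002);  // split on dst IP
>    out :: Queue -> ToDevice(eth1);
> 
>    pd -> cl;
>    cl[0] -> Queue -> unq_a :: Unqueue -> out;    // client A's path
>    cl[1] -> Queue -> unq_b :: Unqueue -> out;    // client B's path
> 
>    // equal tickets so the two Unqueue tasks get equal CPU shares
>    ScheduleInfo(pd 1, unq_a 1, unq_b 1);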
>  
> However, Beyers' suggestion seems like an interesting solution to the problem.
>  
> In addition, I was also thinking of a reactive solution that measures
> the time a task needed to run through the entire path of elements (or
> just a slice of the path in the case of Beyers' suggestion). We would
> also define a time quantum, used only as a reference to compare each
> task's run time against. That is, after the task returned we would
> calculate "X = time_consumed_by_the_task / quantum" and would increase
> this task's "pass" by "X*stride" instead of "stride", as described in
> section 2.4 of the Stride Scheduling paper. In this case, if a task ran
> longer than it should have (i.e. longer than a quantum), it would be
> scheduled later (in proportion to the exceeded time) than it normally
> would be.
> I'm not sure how much overhead checking the time twice per scheduled
> task adds, but if it's not much this might be the simpler update of the
> two. Unlike Beyers' suggestion, this solution in itself wouldn't protect
> the system from loops in the element path, but that might be sorted out
> by a parser or something.
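> 
> As a rough sketch of the pass update I have in mind (pseudo-C++: the
> QUANTUM_CYCLES constant and the Task accessors are made up, and I'm
> assuming the click_get_cycles() cycle-counter helper that the
> cycle-counting elements use):
> 
>     // hypothetical scheduler-loop fragment, not real Click code
>     click_cycles_t t0 = click_get_cycles();
>     task->run();                     // runs the entire push/pull path
>     click_cycles_t used = click_get_cycles() - t0;
> 
>     // charge the task for however many quanta it actually consumed,
>     // rounding up and sticking to integer math (kernel context)
>     click_cycles_t x = (used + QUANTUM_CYCLES - 1) / QUANTUM_CYCLES;
>     if (x < 1)
>         x = 1;                       // never charge less than a quantum
>     task->set_pass(task->pass() + x * task->stride());
> 
> With x == 1 this degenerates to the current "pass += stride" behaviour,
> so tasks that stay within their quantum would be scheduled exactly as
> they are today.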
>  
> What do you think about this?
> 
> Thanks for the answers,
> Norbert
> 
> Beyers Cronje wrote:
>> Interesting subject. As Roman pointed out, each scheduled task will run
>> through the entire push/pull path before it returns, so it would
>> probably take some effort to implement timeslices with task-interruption
>> support.
>>
>> The following is just a thought and might not be a viable option. One
>> way to do this would probably be between push()/pull() calls. Have an
>> intermediate function (let's call it prePush()) per element that's
>> executed before each element's push() function. prePush() can then
>> check whether the allocated forwarding path has exceeded its timeslice.
>> If it has, it will store a global pointer for this push/pull path and
>> return without continuing, effectively ending the scheduled task. Once
>> the task is scheduled again it can either continue where it left off
>> using this global pointer, or start a new push/pull process. The point
>> of a separate prePush is to make it easier to update existing element
>> code.
>>
>> To illustrate:
>>
>> Operation now:
>> Element1::run_task() -> Element2::push() -> Element3::push() .....
>>
>> Proposed:
>> Element1::run_task() -> Element2::prePush() -> Element2::push() ->
>> Element3::prePush() -> Element3::push() ......
>>
>> Say the timeslice ends in Element3::prePush(); then the next task
>> schedule path will look like this:
>> Element1::run_task() -> Element3::prePush() -> Element3::push() ......
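>>
>> A rough sketch of what prePush() might look like (every name here is
>> hypothetical, nothing like this exists in Click today, and the glue
>> that dispatches push() calls would have to be taught to call it):
>>
>>     // per-thread resume state for the interrupted path
>>     static Element *resume_point = 0;
>>     static int      resume_port = 0;
>>     static Packet  *resume_packet = 0;
>>     static uint64_t path_deadline = 0;  // set when the task starts
>>
>>     bool Element::prePush(int port, Packet *p)
>>     {
>>         if (click_get_cycles() < path_deadline)
>>             return true;         // timeslice left: go on to push()
>>         resume_point = this;     // remember where we stopped
>>         resume_port = port;
>>         resume_packet = p;
>>         return false;            // unwind: the task ends for this slice
>>     }
>>
>> When the task is scheduled again, run_task() would check resume_point
>> first and, if it is set, jump straight to
>> resume_point->push(resume_port, resume_packet) instead of starting
>> from the top.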
>>
>> Beyers
>>
>> On 6/28/07, Roman Chertov <rchertov at purdue.edu> wrote:
>>
>>     In Click you have one or more threads of execution.  Most of the
>>     elements in Click are passive and do not generate any tasks
>>     themselves.  When an element like a Queue/PollDevice/ToDevice
>>     schedules a task and a packet is available, this task will run
>>     through the entire chain of elements until it gets into another
>>     Queue/PollDevice/ToDevice element (see the Click SMP paper).  So in
>>     order to do what you are proposing, you would have to interrupt a
>>     Click thread in the middle of executing some task and then start a
>>     new one (obviously you have to remember the state so that you can
>>     come back).  I don't think this would be easy to do, as I don't
>>     think the executing threads (RouterThread) are designed to be
>>     interrupted.
>>
>>     Hopefully Eddie will shed more light on this.
>>
>>     Roman
>>
>>     Egi, Norbert wrote:
>>      > Hi,
>>      >
>>      > We are using Click for designing a shared forwarding engine
>>      > with separate forwarding paths (chains of elements) per
>>      > customer. The customers share all the NICs, so after the packets
>>      > are polled from the NIC's queue a Classifier demultiplexes them
>>      > by emitting them to the proper forwarding path. Our intention is
>>      > to provide fair CPU scheduling for each of the forwarding paths.
>>      > The stride scheduling algorithm, implemented in Click, would be
>>      > a perfect solution for our problem, but after checking the
>>      > source code and the available documents I found that the
>>      > algorithm hasn't been implemented completely as it was proposed.
>>      > If I understand the source and its comments correctly, there is
>>      > no such thing as a "quantum" (i.e. a discrete time slice during
>>      > which the scheduled task is entitled to use the CPU), and I
>>      > guess that's the main reason why the operation of the elements
>>      > cannot be interrupted.
>>      >
>>      > In the first comment of lib/task.cc it's mentioned that this
>>      > may be addressed in the future, so I was wondering whether
>>      > anyone is working on it, in which case I could jump in and help;
>>      > otherwise, could someone provide information on the fastest way
>>      > to extend the current version of the code to make elements
>>      > interruptible?
>>      >
>>      > Regards,
>>      > Norbert