[Click] schedule tasks [ns-3 + click]

Sascha Alexander Jopen jopen at informatik.uni-bonn.de
Thu Dec 22 06:15:57 EST 2011


Hi,

Although I'm not really happy with those 1us steps, I think they are
still necessary. timeval_ceil() helps when a task is scheduled for a
timestamp that has a sub-microsecond (nanosecond) part. If it has none,
timeval_ceil() does nothing. Some elements, like InfiniteSource,
reschedule a new task all the time. Because the work done within the
elements doesn't consume any simulation time, the task is rescheduled
for the exact same timestamp every time, effectively ending in an
endless loop. Every "polling"-like element will suffer from this
problem. If nothing else adds simulation time between task executions
within the Click driver, then we need it there, just before ns
scheduling, don't we?
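
To make the timeval_ceil() point concrete, here is a minimal sketch in
C++. It is not Click's code: NsTime, to_timeval_trunc(),
to_timeval_ceil() and add_usec() are hypothetical stand-ins for
Timestamp, Timestamp::timeval(), Timestamp::timeval_ceil() and adding
Timestamp::make_usec(1), and I am assuming the ceiling simply rounds up
to the next whole microsecond.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/time.h>

// Hypothetical nanosecond-precision time value (not Click's Timestamp).
struct NsTime {
    int64_t sec;
    int64_t nsec;  // 0 .. 999999999
};

// Truncating conversion: the sub-microsecond part is dropped.
timeval to_timeval_trunc(NsTime t) {
    timeval tv;
    tv.tv_sec = static_cast<time_t>(t.sec);
    tv.tv_usec = static_cast<suseconds_t>(t.nsec / 1000);
    return tv;
}

// Ceiling conversion: round up to the next whole microsecond, but only
// if there is a nanosecond remainder. An aligned time is left unchanged.
timeval to_timeval_ceil(NsTime t) {
    timeval tv = to_timeval_trunc(t);
    if (t.nsec % 1000 != 0) {
        if (++tv.tv_usec == 1000000) {  // carry into the seconds field
            tv.tv_usec = 0;
            ++tv.tv_sec;
        }
    }
    return tv;
}

// Add a non-negative number of microseconds, with carry into seconds.
NsTime add_usec(NsTime t, int64_t usec) {
    t.nsec += usec * 1000;
    t.sec += t.nsec / 1000000000;
    t.nsec %= 1000000000;
    return t;
}
```

With a time of 5 s + 1500 ns, truncation gives 1 us and the ceiling
gives 2 us, so the callback no longer arrives too early. With 5 s +
2000 ns the ceiling returns the same 2 us, so only the extra
add_usec(t, 1) moves the next callback forward. Without it, an
always-active task is rescheduled for the same timestamp.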

A simple test script with only an InfiniteSource shows this behaviour:

InfiniteSource
        -> IPEncap(SRC eth0:ip, DST eth0:bcast,  TTL 255, PROTO 253)
        -> Queue
        -> IPPrint(Sending)
        -> ToSimDevice(eth0);
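
The effect can also be modelled without Click or ns-3 at all. The
following is only a toy sketch under my own assumptions: run_polling(),
PollResult and the parameters are hypothetical names, the "driver run"
is reduced to one loop iteration, and the simulator is reduced to a
clock that jumps to the requested callback time.

```cpp
#include <cstdint>

struct PollResult {
    bool timer_fired;     // did the simulated clock reach the timer expiry?
    int64_t driver_runs;  // number of driver invocations used
};

// Models an always-active (polling) task next to one pending timer.
// step_us is the artificial offset added before asking the simulator
// for the next callback; max_runs bounds the otherwise endless loop.
PollResult run_polling(int64_t timer_expiry_us, int64_t step_us,
                       int64_t max_runs) {
    int64_t now_us = 0;
    for (int64_t runs = 1; runs <= max_runs; ++runs) {
        // Driver run: the polling task executes and reschedules itself.
        // Its work consumes no simulation time.
        if (now_us >= timer_expiry_us)
            return {true, runs};
        // A task is still scheduled, so request a callback at
        // now + step. The simulator jumps straight to that time.
        now_us += step_us;
    }
    return {false, max_runs};
}
```

With step_us = 0 the clock never moves and a timer due at 3000 us never
fires, however many driver runs are allowed. With step_us = 1 it fires,
but only on the 3001st driver run, which is the long-running-simulation
cost of the 1us steps.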

Sascha

On 12/21/11 14:55, Eddie Kohler wrote:
> Hi Sascha, Björn,
> 
> Removing the 1us steps WAS intentional. I thought maybe
> timeval_ceil() would be enough here. Sascha, I looked at the commit
> history to try to figure out why the 1us steps were necessary, but
> the commit message wasn't enough. Do you think the 1us is
> necessary? If so, why? Is there a risk that an always-active Click
> config will cause ns time to stop?
> 
> Eddie
> 
> 
> 2011/12/21 Björn Lichtblau <lichtbla at informatik.hu-berlin.de>:
>> Hi,
>> 
>> My tests went well so far, and I also noticed from the tests that
>> the artificial 1us steps I did not like were gone. If that's
>> intended: double thumbs up!
>> 
>> Regards, Björn
>> 
>> On 21.12.2011 11:50, Sascha Alexander Jopen wrote:
>>> Hey Eddie,
>>> 
>>> I didn't test your changes yet, but after looking into the code
>>> I think you accidentally removed the newly introduced artificial
>>> time increase in lib/routerthread.cc line 683. I think this
>>> should read
>>> 
>>> struct timeval nexttime = (Timestamp::now() + 
>>> Timestamp::make_usec(1)).timeval_ceil();
>>> 
>>> Regards, Sascha
>>> 
>>> Am 20.12.2011 17:31, schrieb Eddie Kohler:
>>>> Hi Sascha,
>>>> 
>>>> I applied a different version of your patch, using a new
>>>> function (Timestamp::timeval_ceil()) written for the purpose.
>>>> This also slightly changed the active() case in ns3
>>>> scheduling. Take a look; does it work for you?
>>>> 
>>>> Best, Eddie
>>>> 
>>>> 
>>>> On Fri, Dec 16, 2011 at 5:40 PM, Sascha Alexander Jopen 
>>>> <jopen at informatik.uni-bonn.de>  wrote:
>>>>> Hey Eddie,
>>>>> 
>>>>> I think my patch is still necessary. The one integrated
>>>>> into ns3 is about rounding errors between double and
>>>>> integer time representations, which led to similar endless
>>>>> loops, but on the ns3 side.
>>>>> 
>>>>> Regards, Sascha
>>>>> 
>>>>> Am 16.12.2011 15:20, schrieb Eddie Kohler:
>>>>>> Hi Björn, Sascha,
>>>>>> 
>>>>>> So, I'm a bit behind. Should I apply Sascha's patch, or
>>>>>> some other version?
>>>>>> 
>>>>>> Best, Eddie
>>>>>> 
>>>>>> 
>>>>>> On 12/04/2011 10:25 AM, Björn Lichtblau wrote:
>>>>>>> Hi, this flaw was already fixed on ns3 side (in
>>>>>>> ns3-dev): 
>>>>>>> http://code.nsnam.org/ns-3-dev/rev/0d04b625ea54
>>>>>>> 
>>>>>>> Regarding Eddie's change: it works, but I have had no
>>>>>>> time yet to test it extensively. While it works, I think
>>>>>>> it's a bit of a workaround to let the simulator do
>>>>>>> artificial 1 us steps. However, it should not hurt in
>>>>>>> most cases, and doing it more cleanly may be hard. I'll
>>>>>>> comment in more detail soon.
>>>>>>> 
>>>>>>> Regards, Björn
>>>>>>> 
>>>>>>> On 03.12.2011 23:42, Sascha Alexander Jopen wrote:
>>>>>>>> Hey,
>>>>>>>> 
>>>>>>>> after running some real simulations I found that
>>>>>>>> scheduling timers may lead to infinite loops with
>>>>>>>> nsclick's scheduling. On 64-bit systems the nanosecond
>>>>>>>> precision of a timestamp is mapped to a truncated,
>>>>>>>> microsecond-precision timeval. The next ns event is
>>>>>>>> scheduled at this microsecond timeval, but Click's
>>>>>>>> run_timers() method doesn't run the timer, because
>>>>>>>> its expiry has not been reached. nsclick is then
>>>>>>>> scheduled again for this timer, again some nanoseconds
>>>>>>>> too early.
>>>>>>>> 
>>>>>>>> The attached patch fixes this problem by scheduling
>>>>>>>> nsclick for the next microsecond after the timer
>>>>>>>> expires.
>>>>>>>> 
>>>>>>>> Regards, Sascha
>>>>>>>> 
>>>>>>>> On 11/09/11 18:56, Eddie Kohler wrote:
>>>>>>>>> Sascha, Björn,
>>>>>>>>> 
>>>>>>>>> Thanks for your patience with this problem.  I took
>>>>>>>>> a look at the patch, but decided to do it a
>>>>>>>>> different way that is less intrusive and hopefully
>>>>>>>>> addresses all the cases.  The checkin is here:
>>>>>>>>> 
>>>>>>>>> https://github.com/kohler/click/commit/e3c9f295ea1d5c5700e26f19afce873b0ce755f5
>>>>>>>>> 
>>>>>>>>> Please let me know if this does not work (I have not
>>>>>>>>> tested it).
>>>>>>>>> 
>>>>>>>>> Best, Eddie
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Nov 4, 2011 at 3:05 PM, Sascha Alexander
>>>>>>>>> Jopen <jopen at informatik.uni-bonn.de>    wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> 
>>>>>>>>>> after rereading the stated problems, I found that
>>>>>>>>>> my patch did not catch the case where a timer
>>>>>>>>>> schedules a task. The attached patch should fix
>>>>>>>>>> this. Both test scenarios from Björn seem to run
>>>>>>>>>> as expected with this patch.
>>>>>>>>>> 
>>>>>>>>>> However, I'm still not sure how much time should
>>>>>>>>>> pass between two driver runs.
>>>>>>>>>> 
>>>>>>>>>> Regards, Sascha
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On 11/04/11 17:28, Sascha Alexander Jopen wrote:
>>>>>>>>>>> Hey,
>>>>>>>>>>> 
>>>>>>>>>>> I use a slightly different fix. After running
>>>>>>>>>>> the tasks, I check whether at least one task is
>>>>>>>>>>> still scheduled. If so, I reschedule the router
>>>>>>>>>>> thread for execution in ns-3 again with the
>>>>>>>>>>> smallest possible time offset, which is one
>>>>>>>>>>> microsecond. This way it doesn't matter whether
>>>>>>>>>>> there are other timers to be executed or not.
>>>>>>>>>>> For elements which poll using tasks all the
>>>>>>>>>>> time, this means that simulation time advances
>>>>>>>>>>> only one microsecond per iteration, which could
>>>>>>>>>>> lead to really long-running simulations. Using
>>>>>>>>>>> such elements is possible this way, however.
>>>>>>>>>>> 
>>>>>>>>>>> Regards, Sascha
>>>>>>>>>>> 
>>>>>>>>>>> On 11/04/11 12:15, Björn Lichtblau wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> 
>>>>>>>>>>>> I think you are experiencing what I described
>>>>>>>>>>>> as problem A in
>>>>>>>>>>>> http://pdos.csail.mit.edu/pipermail/click/2011-October/010357.html
>>>>>>>>>>>> 
>>>>>>>>>>>> Tasks scheduled by a fired timer are not run
>>>>>>>>>>>> immediately, because run_timers() comes after
>>>>>>>>>>>> run_tasks() and the RouterThread::driver() loop
>>>>>>>>>>>> runs only once per call from the simulation.
>>>>>>>>>>>> The fix I described there, however, was not
>>>>>>>>>>>> correct. Currently I'm working with this (which
>>>>>>>>>>>> may be incorrect too):
>>>>>>>>>>>> 
>>>>>>>>>>>> diff --git a/lib/routerthread.cc b/lib/routerthread.cc
>>>>>>>>>>>> index d118a91..7232b79 100644
>>>>>>>>>>>> --- a/lib/routerthread.cc
>>>>>>>>>>>> +++ b/lib/routerthread.cc
>>>>>>>>>>>> @@ -640,14 +640,7 @@ RouterThread::driver()
>>>>>>>>>>>>             _oticks = ticks;
>>>>>>>>>>>>  #endif
>>>>>>>>>>>>             timer_set().run_timers(this, _master);
>>>>>>>>>>>> -#if CLICK_NS
>>>>>>>>>>>> -           // If there's another timer, tell the simulator to make us
>>>>>>>>>>>> -           // run when it's due to go off.
>>>>>>>>>>>> -           if (Timestamp next_expiry = timer_set().timer_expiry_steady()) {
>>>>>>>>>>>> -               struct timeval nexttime = next_expiry.timeval();
>>>>>>>>>>>> -               simclick_sim_command(_master->simnode(), SIMCLICK_SCHEDULE,&nexttime);
>>>>>>>>>>>> -           }
>>>>>>>>>>>> -#endif
>>>>>>>>>>>> +
>>>>>>>>>>>>         } while (0);
>>>>>>>>>>>> 
>>>>>>>>>>>>         // run operating system
>>>>>>>>>>>> @@ -667,7 +660,20 @@ RouterThread::driver()
>>>>>>>>>>>>  #if CLICK_NS || BSD_NETISRSCHED
>>>>>>>>>>>>         // Everyone except the NS driver stays in driver() until the driver is
>>>>>>>>>>>>         // stopped.
>>>>>>>>>>>> -       break;
>>>>>>>>>>>> +       if(task_begin() == task_end()){
>>>>>>>>>>>> +#if CLICK_NS
>>>>>>>>>>>> +           // If there's another timer, tell the simulator to make us
>>>>>>>>>>>> +           // run when it's due to go off.
>>>>>>>>>>>> +           if (Timestamp next_expiry = timer_set().timer_expiry_steady()) {
>>>>>>>>>>>> +               struct timeval nexttime = next_expiry.timeval();
>>>>>>>>>>>> +               simclick_sim_command(_master->simnode(), SIMCLICK_SCHEDULE,&nexttime);
>>>>>>>>>>>> +           }
>>>>>>>>>>>> +#endif
>>>>>>>>>>>> +           break;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       }
>>>>>>>>>>>>  #endif
>>>>>>>>>>>>  }
>>>>>>>>>>>> 
>>>>>>>>>>>> So the driver loop only exits when task_begin()
>>>>>>>>>>>> == task_end() (meaning it will loop again if a
>>>>>>>>>>>> timer scheduled a task), and we only tell the
>>>>>>>>>>>> simulator to schedule a timer when the loop
>>>>>>>>>>>> is exiting. While this works fine for me, a
>>>>>>>>>>>> colleague working with ns2 still has problems
>>>>>>>>>>>> with ToSimDevice in some cases, which is too
>>>>>>>>>>>> much to describe now.
>>>>>>>>>>>> 
>>>>>>>>>>>> Another potential problem I stumbled over in
>>>>>>>>>>>> the documentation
>>>>>>>>>>>> (http://www.read.cs.ucla.edu/click/doxygen/classTimer.html)
>>>>>>>>>>>> is busy waiting: "Particularly at user level,
>>>>>>>>>>>> there can be a significant delay between a
>>>>>>>>>>>> Timer's nominal expiration time and the actual
>>>>>>>>>>>> time it runs. Elements that desire extremely
>>>>>>>>>>>> precise timings should combine a Timer with a
>>>>>>>>>>>> Task. The Timer is set to go off a bit before
>>>>>>>>>>>> the true expiration time (see
>>>>>>>>>>>> Timer::adjustment()), after which the Task
>>>>>>>>>>>> polls the CPU until the actual expiration
>>>>>>>>>>>> time arrives."
>>>>>>>>>>>> 
>>>>>>>>>>>> Elements doing busy waiting are a big
>>>>>>>>>>>> headache in the sim environment, and I have no
>>>>>>>>>>>> nice idea yet for fixing such cases, because
>>>>>>>>>>>> time will never advance without setting a
>>>>>>>>>>>> timer...
>>>>>>>>>>>> 
>>>>>>>>>>>> Regards, Björn
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On 11/04/2011 11:29 AM, Giovanni Di Stasi
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I have developed an element which behaves
>>>>>>>>>>>>> like a Queue. It has a pull output which is
>>>>>>>>>>>>> connected to a ToSimDevice.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Sometimes, when the pull gets called on the
>>>>>>>>>>>>> element, it returns a NULL and sets a timer
>>>>>>>>>>>>> which expires after a few milliseconds
>>>>>>>>>>>>> (e.g. 3). When the timer expires, an
>>>>>>>>>>>>> empty_notifier is "activated" 
>>>>>>>>>>>>> (empty_not.wake()). I would expect, at this
>>>>>>>>>>>>> point, the ToSimDevice task to be run, and
>>>>>>>>>>>>> the pull to be called on my element.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Unfortunately this does not happen. The
>>>>>>>>>>>>> Task schedule function seems to be called
>>>>>>>>>>>>> after the notifier is woken up, but the
>>>>>>>>>>>>> task is not run. Is this normal? The task
>>>>>>>>>>>>> is sometimes run a few seconds later. So I
>>>>>>>>>>>>> have two questions: is it possible to
>>>>>>>>>>>>> schedule a task to be run right away (maybe
>>>>>>>>>>>>> by putting it at the top of the Click
>>>>>>>>>>>>> pending tasks)? If not, what delay should I
>>>>>>>>>>>>> expect between the moment I schedule the
>>>>>>>>>>>>> Task and the moment it gets executed?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks, Giovanni
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> click mailing list
>>>>>>>>>>>> click at amsterdam.lcs.mit.edu 
>>>>>>>>>>>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click