[Click] schedule tasks [ns-3 + click]

Sascha Alexander Jopen jopen at informatik.uni-bonn.de
Thu Dec 22 14:54:20 EST 2011


Hey,

With the attached patch everything is fine. You forgot to count the
iterations spent :-)
I think a maximum of 1000 iterations is enough, but I have to check this
with reasonable simulations. However, it would be nice if this parameter
were configurable somehow.
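
A minimal sketch of how such an iteration cap could work, assuming the
driver repeats task/timer passes at one simulation time and falls back to
a 1us step when the cap is hit. The names run_bounded, DriverResult and
run_pass are hypothetical; they are not taken from the attached patch or
from Click's API:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical result of one bounded driver call; not part of Click's API.
struct DriverResult {
    int iterations;               // passes actually executed
    bool tasks_pending;           // true if the cap was hit with work left
    std::int64_t next_wakeup_us;  // wake-up time to request if tasks_pending
};

// Run at most max_iterations passes at the current simulation time.
// run_pass executes one round of tasks and timers and returns true if
// tasks are still scheduled afterwards. If work remains when the cap is
// reached, request a wake-up 1 us later so simulated time keeps advancing.
inline DriverResult run_bounded(std::int64_t now_us, int max_iterations,
                                const std::function<bool()>& run_pass) {
    assert(max_iterations > 0);
    DriverResult r{0, true, now_us};
    while (r.tasks_pending && r.iterations < max_iterations) {
        r.tasks_pending = run_pass();
        ++r.iterations;  // without this increment the cap never triggers
    }
    if (r.tasks_pending)
        r.next_wakeup_us = now_us + 1;
    return r;
}
```

Passing max_iterations as an argument is one way to make the limit
configurable; an always-active element such as InfiniteSource would then
cost at most max_iterations passes per microsecond of simulated time.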

Sascha


On 12/22/11 17:11, Eddie Kohler wrote:
> Hi Sascha,
> 
> Fair enough -- that's exactly the problem I was worried about.
> Please take a look at commit 2754456. Does it help? Do you advise a
> different constant?
> 
> Eddie
> 
> On 12/22/11 6:15 AM, Sascha Alexander Jopen wrote:
>> Hi,
>>
>> Although I'm not really happy with those 1us steps, I think they are
>> still necessary. timeval_ceil() does help when a task is scheduled
>> at a time with some nanoseconds. If it is not, timeval_ceil() does
>> nothing. Some elements like InfiniteSource will reschedule a new task
>> all the time. Because the work done within the elements doesn't
>> consume any simulation time, this leads to a task rescheduled for the
>> exact same timestamp every time, effectively ending in an endless
>> loop. Every polling-style element will suffer from this problem.
>> If nothing else adds simulation time between task executions within
>> the Click driver, then we need it there, just before ns scheduling,
>> don't we?
>>
>> A simple test script with only an InfiniteSource shows this behaviour:
>>
>> InfiniteSource
>>          ->  IPEncap(SRC eth0:ip, DST eth0:bcast,  TTL 255, PROTO 253)
>>          ->  Queue
>>          ->  IPPrint(Sending)
>>          ->  ToSimDevice(eth0);
>>
>> Sascha
>>
>> On 12/21/11 14:55, Eddie Kohler wrote:
>>> Hi Sascha, Björn,
>>>
>>> Removing the 1us steps WAS intentional. I thought maybe
>>> timeval_ceil() would be enough here. Sascha, I looked at the commit
>>> history to try to figure out why the 1us steps were necessary, but
>>> the commit message wasn't enough. Do you think the 1us is
>>> necessary? If so, why? Is there a risk that an always-active Click
>>> config will cause ns time to stop?
>>>
>>> Eddie
>>>
>>>
>>> 2011/12/21 Björn Lichtblau<lichtbla at informatik.hu-berlin.de>:
>>>> Hi,
>>>>
>>>> my tests went well so far, and I also noticed from the tests that
>>>> the artificial 1us steps I did not like were gone. If that's
>>>> intended: double thumbs up!
>>>>
>>>> Regards, Björn
>>>>
>>>> On 21.12.2011 11:50, Sascha Alexander Jopen wrote:
>>>>> Hey Eddie,
>>>>>
>>>>> I didn't test your changes yet, but after looking into the code
>>>>> I think you accidentally removed the newly introduced artificial
>>>>> time increase in lib/routerthread.cc line 683. I think this
>>>>> should read
>>>>>
>>>>> struct timeval nexttime = (Timestamp::now() +
>>>>> Timestamp::make_usec(1)).timeval_ceil();
>>>>>
>>>>> Regards, Sascha
>>>>>
>>>>> On 20.12.2011 17:31, Eddie Kohler wrote:
>>>>>> Hi Sascha,
>>>>>>
>>>>>> I applied a different version of your patch, using a new
>>>>>> function (Timestamp::timeval_ceil()) written for the purpose.
>>>>>> This also slightly changed the active() case in ns3
>>>>>> scheduling. Take a look; does it work for you?
>>>>>>
>>>>>> Best, Eddie
>>>>>>
>>>>>>
>>>>>> On Fri, Dec 16, 2011 at 5:40 PM, Sascha Alexander Jopen
>>>>>> <jopen at informatik.uni-bonn.de>   wrote:
>>>>>>> Hey Eddie,
>>>>>>>
>>>>>>> I think my patch is still necessary. The one integrated
>>>>>>> into ns3 is about rounding errors between double and
>>>>>>> integer time representations, which led to similar endless
>>>>>>> loops, but on the ns3 side.
>>>>>>>
>>>>>>> Regards, Sascha
>>>>>>>
>>>>>>> On 16.12.2011 15:20, Eddie Kohler wrote:
>>>>>>>> Hi Björn, Sascha,
>>>>>>>>
>>>>>>>> So, I'm a bit behind. Should I apply Sascha's patch, or
>>>>>>>> some other version?
>>>>>>>>
>>>>>>>> Best, Eddie
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/04/2011 10:25 AM, Björn Lichtblau wrote:
>>>>>>>>> Hi, this flaw was already fixed on ns3 side (in
>>>>>>>>> ns3-dev):
>>>>>>>>> http://code.nsnam.org/ns-3-dev/rev/0d04b625ea54
>>>>>>>>>
>>>>>>>>> Regarding Eddie's change, it works, but I have had no time
>>>>>>>>> yet to test it extensively. While it works, I think letting
>>>>>>>>> the simulator do artificial 1 us steps is a bit of a
>>>>>>>>> workaround. However, it should not hurt in most cases,
>>>>>>>>> and doing it more cleanly may be hard. I'll comment in
>>>>>>>>> more detail soon.
>>>>>>>>>
>>>>>>>>> Regards, Björn
>>>>>>>>>
>>>>>>>>> On 03.12.2011 23:42, Sascha Alexander Jopen wrote:
>>>>>>>>>> Hey,
>>>>>>>>>>
>>>>>>>>>> after running some real simulations I found that
>>>>>>>>>> scheduling timers may lead to infinite loops with
>>>>>>>>>> nsclick's scheduling. On 64-bit systems the nanosecond
>>>>>>>>>> precision of a timestamp is mapped to a truncated
>>>>>>>>>> microsecond-precision timeval. The next ns event is
>>>>>>>>>> scheduled at this microsecond timeval, but Click's
>>>>>>>>>> run_timers() method doesn't run the timer, because
>>>>>>>>>> its expiry has not been reached. nsclick is then
>>>>>>>>>> scheduled again for this timer, again some nanoseconds
>>>>>>>>>> too early.
>>>>>>>>>>
>>>>>>>>>> The attached patch fixes this problem by scheduling
>>>>>>>>>> nsclick to the next microsecond after the timer
>>>>>>>>>> expires.
>>>>>>>>>>
>>>>>>>>>> Regards, Sascha
>>>>>>>>>>
>>>>>>>>>> On 11/09/11 18:56, Eddie Kohler wrote:
>>>>>>>>>>> Sascha, Björn,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your patience with this problem.  I took
>>>>>>>>>>> a look at the patch, but decided to do it a
>>>>>>>>>>> different way that is less intrusive and hopefully
>>>>>>>>>>> addresses all the cases.  The checkin is here:
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/kohler/click/commit/e3c9f295ea1d5c5700e26f19afce873b0ce755f5
>>>>>>>>>>>
>>>>>>>>>>> Please let me know if this does not work (I have not tested it).
>>>>>>>>>>>
>>>>>>>>>>> Best, Eddie
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Nov 4, 2011 at 3:05 PM, Sascha Alexander
>>>>>>>>>>> Jopen<jopen at informatik.uni-bonn.de>     wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> after rereading the stated problems, I found that
>>>>>>>>>>>> my patch did not catch the case when a timer
>>>>>>>>>>>> schedules a task. The attached patch should fix
>>>>>>>>>>>> this. Both test scenarios from Björn seem to run
>>>>>>>>>>>> as expected with this patch.
>>>>>>>>>>>>
>>>>>>>>>>>> However, I'm still not sure how much time should
>>>>>>>>>>>> pass between two driver runs.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards, Sascha
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/04/11 17:28, Sascha Alexander Jopen wrote:
>>>>>>>>>>>>> Hey,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I use a slightly different fix. After running
>>>>>>>>>>>>> the tasks, I check whether there is still at
>>>>>>>>>>>>> least one task scheduled. If so, I
>>>>>>>>>>>>> reschedule the router thread for execution in
>>>>>>>>>>>>> ns-3 again with the smallest possible time
>>>>>>>>>>>>> offset, which is one microsecond. This way it
>>>>>>>>>>>>> doesn't matter whether there are other timers
>>>>>>>>>>>>> to be executed or not. For elements which poll
>>>>>>>>>>>>> using tasks all the time, this means that
>>>>>>>>>>>>> simulation time advances only one microsecond
>>>>>>>>>>>>> per iteration, which could lead to very
>>>>>>>>>>>>> long-running simulations. Using such elements
>>>>>>>>>>>>> is possible this way, however.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards, Sascha
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/04/11 12:15, Björn Lichtblau wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think you are experiencing what I described
>>>>>>>>>>>>>> as problem A in
>>>>>>>>>>>>>> http://pdos.csail.mit.edu/pipermail/click/2011-October/010357.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Tasks scheduled by a fired timer are not run
>>>>>>>>>>>>>> immediately, because run_timers() comes after
>>>>>>>>>>>>>> run_tasks() and the RouterThread::driver()
>>>>>>>>>>>>>> loop runs only once per call from the
>>>>>>>>>>>>>> simulation. However, the fix I described
>>>>>>>>>>>>>> there was not correct; currently I'm working
>>>>>>>>>>>>>> with this (which may be incorrect too):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/lib/routerthread.cc b/lib/routerthread.cc
>>>>>>>>>>>>>> index d118a91..7232b79 100644
>>>>>>>>>>>>>> --- a/lib/routerthread.cc
>>>>>>>>>>>>>> +++ b/lib/routerthread.cc
>>>>>>>>>>>>>> @@ -640,14 +640,7 @@ RouterThread::driver()
>>>>>>>>>>>>>>              _oticks = ticks;
>>>>>>>>>>>>>>  #endif
>>>>>>>>>>>>>>              timer_set().run_timers(this, _master);
>>>>>>>>>>>>>> -#if CLICK_NS
>>>>>>>>>>>>>> -           // If there's another timer, tell the simulator to make us
>>>>>>>>>>>>>> -           // run when it's due to go off.
>>>>>>>>>>>>>> -           if (Timestamp next_expiry = timer_set().timer_expiry_steady()) {
>>>>>>>>>>>>>> -               struct timeval nexttime = next_expiry.timeval();
>>>>>>>>>>>>>> -               simclick_sim_command(_master->simnode(), SIMCLICK_SCHEDULE,&nexttime);
>>>>>>>>>>>>>> -           }
>>>>>>>>>>>>>> -#endif
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>          } while (0);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>          // run operating system
>>>>>>>>>>>>>> @@ -667,7 +660,20 @@ RouterThread::driver()
>>>>>>>>>>>>>>  #if CLICK_NS || BSD_NETISRSCHED
>>>>>>>>>>>>>>          // Everyone except the NS driver stays in driver() until the driver is
>>>>>>>>>>>>>>          // stopped.
>>>>>>>>>>>>>> -       break;
>>>>>>>>>>>>>> +       if(task_begin() == task_end()){
>>>>>>>>>>>>>> +#if CLICK_NS
>>>>>>>>>>>>>> +           // If there's another timer, tell the simulator to make us
>>>>>>>>>>>>>> +           // run when it's due to go off.
>>>>>>>>>>>>>> +           if (Timestamp next_expiry = timer_set().timer_expiry_steady()) {
>>>>>>>>>>>>>> +               struct timeval nexttime = next_expiry.timeval();
>>>>>>>>>>>>>> +               simclick_sim_command(_master->simnode(), SIMCLICK_SCHEDULE,&nexttime);
>>>>>>>>>>>>>> +           }
>>>>>>>>>>>>>> +#endif
>>>>>>>>>>>>>> +           break;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>  #endif
>>>>>>>>>>>>>>      }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So the driver loop only exits when task_begin()
>>>>>>>>>>>>>> == task_end() (meaning it will loop again if a
>>>>>>>>>>>>>> timer scheduled a task), and we only tell the
>>>>>>>>>>>>>> simulator to schedule a timer when the loop
>>>>>>>>>>>>>> is exiting. While this is fine for me, a
>>>>>>>>>>>>>> colleague working with ns2 still has problems
>>>>>>>>>>>>>> with ToSimDevice in some cases, which is too
>>>>>>>>>>>>>> much to describe now.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Another potential problem I stumbled over in
>>>>>>>>>>>>>> the documentation
>>>>>>>>>>>>>> (http://www.read.cs.ucla.edu/click/doxygen/classTimer.html)
>>>>>>>>>>>>>> is busy waiting: "Particularly at user level, there can
>>>>>>>>>>>>>> be a significant delay between a Timer's
>>>>>>>>>>>>>> nominal expiration time and the actual time
>>>>>>>>>>>>>> it runs. Elements that desire extremely
>>>>>>>>>>>>>> precise timings should combine a Timer with a
>>>>>>>>>>>>>> Task. The Timer is set to go off a bit before
>>>>>>>>>>>>>> the true expiration time (see
>>>>>>>>>>>>>> Timer::adjustment()), after which the Task
>>>>>>>>>>>>>> polls the CPU until the actual expiration
>>>>>>>>>>>>>> time arrives."
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Elements doing busy waiting are a big
>>>>>>>>>>>>>> headache in the sim environment, and I have no
>>>>>>>>>>>>>> nice idea yet for fixing such cases, because time
>>>>>>>>>>>>>> will never advance without setting a timer...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards, Björn
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11/04/2011 11:29 AM, Giovanni Di Stasi
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have developed an element which behaves
>>>>>>>>>>>>>>> like a Queue. It has a pull output which is
>>>>>>>>>>>>>>> connected to a ToSimDevice.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sometimes, when the pull gets called on the
>>>>>>>>>>>>>>> element, it returns a NULL and sets a timer
>>>>>>>>>>>>>>> which expires after a few milliseconds
>>>>>>>>>>>>>>> (e.g. 3). When the timer expires, an
>>>>>>>>>>>>>>> empty_notifier is "activated"
>>>>>>>>>>>>>>> (empty_not.wake()). I would expect, at this
>>>>>>>>>>>>>>> point, the ToSimDevice task to be run, and
>>>>>>>>>>>>>>> the pull to be called on my element.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Unfortunately this does not happen. The
>>>>>>>>>>>>>>> Task schedule function seems to be called
>>>>>>>>>>>>>>> after the notifier is woken up, but the
>>>>>>>>>>>>>>> task is not run. Is this normal? The task
>>>>>>>>>>>>>>> is sometimes run a few seconds later. So I
>>>>>>>>>>>>>>> have two questions: is it possible to schedule
>>>>>>>>>>>>>>> a task to be run right away (maybe by
>>>>>>>>>>>>>>> putting it at the top of the Click pending
>>>>>>>>>>>>>>> tasks)? If not, what delay should I expect
>>>>>>>>>>>>>>> between the moment I schedule the Task and
>>>>>>>>>>>>>>> the moment it gets executed?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks, Giovanni
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> click mailing list
>>>>>>>>>>>>>> click at amsterdam.lcs.mit.edu
>>>>>>>>>>>>>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click

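The truncation problem described earlier in the thread (a nanosecond
timestamp mapped to a truncated microsecond timeval) can be illustrated
with plain integers. This is only a sketch under that reading; the helper
names usec_floor, usec_ceil and timer_due are hypothetical and are not
Click's Timestamp API:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: convert a non-negative nanosecond expiry time
// to whole microseconds, rounding down or up.
inline std::int64_t usec_floor(std::int64_t ns) {
    assert(ns >= 0);
    return ns / 1000;
}

inline std::int64_t usec_ceil(std::int64_t ns) {
    assert(ns >= 0);
    return (ns + 999) / 1000;
}

// A timer runs only once the current time has reached its expiry.
inline bool timer_due(std::int64_t now_us, std::int64_t expiry_ns) {
    return now_us * 1000 >= expiry_ns;
}
```

With an expiry of 1000500 ns, waking up at usec_floor() lands 500 ns too
early, so the timer is not due and the same wake-up is requested again;
waking up at usec_ceil() lands after the expiry. For an expiry that is
already a whole number of microseconds both helpers agree, which is why
rounding up alone does not move time forward for always-active tasks.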
-------------- next part --------------
A non-text attachment was scrubbed...
Name: routerthread-iter.patch
Type: text/x-patch
Size: 465 bytes
Desc: not available
Url : http://amsterdam.lcs.mit.edu/pipermail/click/attachments/20111222/f322d54d/attachment.bin 

