[Click] schedule tasks [ns-3 + click]

Eddie Kohler ekohler at gmail.com
Thu Dec 22 15:37:20 EST 2011


Hey Sascha, your patch double-increments _active_iter. I am already incrementing
it in the if statement, which might be a bad habit, but it works. Right? :)
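
To make the double-increment concrete, here is a minimal sketch. It is not the code from the commit: ActiveCounter, tick(), tick_double() and calls_until_cap() are hypothetical names, and the cap of 1000 is just the value suggested in the thread. It assumes the counter is pre-incremented inside the if condition, and that the patch adds a second increment on the same path.

```cpp
#include <cassert>

// Hypothetical stand-in for the driver's bookkeeping; not the real RouterThread.
struct ActiveCounter {
    int active_iter = 0;
    int max_iter;  // assumed cap on consecutive "still active" iterations

    explicit ActiveCounter(int cap) : max_iter(cap) {}

    // Increment happens only inside the if condition.
    // Returns true when the cap is reached and the counter is reset.
    bool tick() {
        if (++active_iter >= max_iter) {
            active_iter = 0;
            return true;
        }
        return false;
    }

    // Same check, plus a second increment (the assumed effect of the patch).
    bool tick_double() {
        if (++active_iter >= max_iter) {
            active_iter = 0;
            return true;
        }
        ++active_iter;
        return false;
    }
};

// Counts how many driver iterations pass before the cap triggers.
int calls_until_cap(bool doubled, int cap) {
    ActiveCounter c(cap);
    int calls = 0;
    for (;;) {
        ++calls;
        if (doubled ? c.tick_double() : c.tick())
            return calls;
    }
}
```

With a cap of 1000, the single increment triggers on the 1000th iteration, while the doubled version triggers after roughly half as many.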

E


On 12/22/11 2:54 PM, Sascha Alexander Jopen wrote:
> Hey,
>
> with the attached patch everything is fine. You forgot to count the
> iterations spent :-)
> I think a maximum of 1000 iterations is enough, but I have to check this
> with reasonable simulations. However, it would be nice if this parameter
> were configurable somehow.
>
> Sascha
>
>
> On 12/22/11 17:11, Eddie Kohler wrote:
>> Hi Sascha,
>>
>> Fair enough -- that's exactly the problem I was worried about.
>> Please take a look at commit 2754456. Does it help? Do you advise a
>> different constant?
>>
>> Eddie
>>
>> On 12/22/11 6:15 AM, Sascha Alexander Jopen wrote:
>>> Hi,
>>>
>>> Although I'm not really happy with those 1us steps, I think they are
>>> still necessary. timeval_ceil() does help when a task is scheduled
>>> at a time with a nanosecond component. If it is not, timeval_ceil() does
>>> nothing. Some elements like InfiniteSource will reschedule a new task
>>> all the time. Because the work done within the elements doesn't consume
>>> any simulation time, this will lead to a rescheduled task for the exact
>>> same timestamp every time, effectively ending in an endless loop. Every
>>> "polling"-like element will suffer from this problem.
>>> If there is nothing else that adds some simulation time between
>>> task executions within the Click driver, then we need it there, just
>>> before ns scheduling, don't we?
>>>
>>> A simple test script with only an InfiniteSource shows this behaviour:
>>>
>>> InfiniteSource
>>>           ->   IPEncap(SRC eth0:ip, DST eth0:bcast,  TTL 255, PROTO 253)
>>>           ->   Queue
>>>           ->   IPPrint(Sending)
>>>           ->   ToSimDevice(eth0);
>>>
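
A toy model of the stall described above, as a sketch only. It is neither Click nor ns-3 code: sim_time_after() and its parameters are made up for illustration. It assumes an always-active task that asks for another wakeup after every run, with step_ns being the artificial advance added before the wakeup is requested.

```cpp
#include <cstdint>

// Toy model: returns the simulated time (in ns) after `wakeups` driver runs
// of an always-active task. Work done inside elements costs no simulated
// time, so the clock only moves by the artificial step.
std::int64_t sim_time_after(std::int64_t start_ns, int wakeups,
                            std::int64_t step_ns) {
    std::int64_t now = start_ns;
    for (int i = 0; i < wakeups; ++i) {
        // The task runs here and immediately reschedules itself.
        now += step_ns;  // time of the next requested wakeup
    }
    return now;
}
```

With step_ns = 0 the clock never moves, no matter how many wakeups happen. With step_ns = 1000 (1us) it does move, but one simulated second costs a million wakeups, which is the trade-off being discussed.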
>>> Sascha
>>>
>>> On 12/21/11 14:55, Eddie Kohler wrote:
>>>> Hi Sascha, Björn,
>>>>
>>>> Removing the 1us steps WAS intentional. I thought maybe
>>>> timeval_ceil() would be enough here. Sascha, I looked at the commit
>>>> history to try to figure out why the 1us steps were necessary, but
>>>> the commit message wasn't enough. Do you think the 1us is
>>>> necessary? If so, why? Is there a risk that an always-active Click
>>>> config will cause ns time to stop?
>>>>
>>>> Eddie
>>>>
>>>>
>>>> 2011/12/21 Björn Lichtblau<lichtbla at informatik.hu-berlin.de>:
>>>>> Hi,
>>>>>
>>>>> my tests went well so far, and I also noticed from the tests that
>>>>> the artificial 1us steps I did not like were gone. If that's
>>>>> intended: double thumbs up!
>>>>>
>>>>> Regards, Björn
>>>>>
>>>>> On 21.12.2011 11:50, Sascha Alexander Jopen wrote:
>>>>>> Hey Eddie,
>>>>>>
>>>>>> I didn't test your changes yet, but after looking into the code
>>>>>> I think you accidentally removed the newly introduced artificial
>>>>>> time increase in lib/routerthread.cc line 683. I think this
>>>>>> should read
>>>>>>
>>>>>> struct timeval nexttime = (Timestamp::now() +
>>>>>> Timestamp::make_usec(1)).timeval_ceil();
>>>>>>
>>>>>> Regards, Sascha
>>>>>>
>>>>>> On 20.12.2011 17:31, Eddie Kohler wrote:
>>>>>>> Hi Sascha,
>>>>>>>
>>>>>>> I applied a different version of your patch, using a new
>>>>>>> function (Timestamp::timeval_ceil()) written for the purpose.
>>>>>>> This also slightly changed the active() case in ns3
>>>>>>> scheduling. Take a look; does it work for you?
>>>>>>>
>>>>>>> Best, Eddie
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Dec 16, 2011 at 5:40 PM, Sascha Alexander Jopen
>>>>>>> <jopen at informatik.uni-bonn.de>    wrote:
>>>>>>>> Hey Eddie,
>>>>>>>>
>>>>>>>> I think my patch is still necessary. The one integrated
>>>>>>>> into ns3 is about rounding errors between double and
>>>>>>>> integer time representations, which led to similar endless
>>>>>>>> loops, but on the ns3 side.
>>>>>>>>
>>>>>>>> Regards, Sascha
>>>>>>>>
>>>>>>>> On 16.12.2011 15:20, Eddie Kohler wrote:
>>>>>>>>> Hi Björn, Sascha,
>>>>>>>>>
>>>>>>>>> So, I'm a bit behind. Should I apply Sascha's patch, or
>>>>>>>>> some other version?
>>>>>>>>>
>>>>>>>>> Best, Eddie
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/04/2011 10:25 AM, Björn Lichtblau wrote:
>>>>>>>>>> Hi, this flaw was already fixed on the ns3 side (in
>>>>>>>>>> ns3-dev):
>>>>>>>>>> http://code.nsnam.org/ns-3-dev/rev/0d04b625ea54
>>>>>>>>>>
>>>>>>>>>> Regarding Eddie's change, it works, but I have had no time yet
>>>>>>>>>> to test it extensively. While it works, I think it's a
>>>>>>>>>> bit of a workaround to let the simulator do artificial 1
>>>>>>>>>> us steps. However, it should not hurt in most cases,
>>>>>>>>>> and doing it more cleanly may be hard. I'll comment in more
>>>>>>>>>> detail soon.
>>>>>>>>>>
>>>>>>>>>> Regards, Björn
>>>>>>>>>>
>>>>>>>>>> On 03.12.2011 23:42, Sascha Alexander Jopen wrote:
>>>>>>>>>>> Hey,
>>>>>>>>>>>
>>>>>>>>>>> after running some real simulations I found that
>>>>>>>>>>> scheduling timers may lead to infinite loops with
>>>>>>>>>>> nsclick's scheduling. On 64-bit systems the nanosecond
>>>>>>>>>>> precision of a timestamp is mapped to a truncated
>>>>>>>>>>> microsecond-precision timeval. The next ns event is
>>>>>>>>>>> scheduled at this microsecond timeval, but Click's
>>>>>>>>>>> run_timers() method doesn't run the timer, because
>>>>>>>>>>> its expiry has not been reached. nsclick is then
>>>>>>>>>>> scheduled again for this timer, again some nanoseconds
>>>>>>>>>>> too early.
>>>>>>>>>>>
>>>>>>>>>>> The attached patch fixes this problem by scheduling
>>>>>>>>>>> nsclick at the next microsecond after the timer
>>>>>>>>>>> expires.
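
The truncation problem above can be sketched as follows. This is an illustration under assumptions, not Click's Timestamp implementation: MicroTime, to_micro_trunc(), to_micro_ceil() and to_ns() are hypothetical helpers, and times are assumed to be non-negative nanosecond counts.

```cpp
#include <cstdint>

// Hypothetical microsecond-precision time, standing in for struct timeval.
struct MicroTime {
    std::int64_t sec;
    std::int64_t usec;
};

// Truncating conversion: drops any sub-microsecond part (assumes ns >= 0).
MicroTime to_micro_trunc(std::int64_t ns) {
    MicroTime t;
    t.sec = ns / 1000000000;
    t.usec = (ns % 1000000000) / 1000;
    return t;
}

// Ceiling conversion: rounds up to the next whole microsecond, and leaves
// values that are already on a microsecond boundary unchanged.
MicroTime to_micro_ceil(std::int64_t ns) {
    return to_micro_trunc(ns + 999);
}

// Back to nanoseconds, to compare a wakeup time against a timer expiry.
std::int64_t to_ns(const MicroTime& t) {
    return t.sec * 1000000000 + t.usec * 1000;
}
```

For a timer expiring at 1.000000500 s, the truncated wakeup lands 500 ns before the expiry, so the timer is not yet due when the driver runs. The rounded-up wakeup lands at or after the expiry. A time already on a microsecond boundary is not moved at all, which is why rounding up alone does not help an always-active task.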
>>>>>>>>>>>
>>>>>>>>>>> Regards, Sascha
>>>>>>>>>>>
>>>>>>>>>>> On 11/09/11 18:56, Eddie Kohler wrote:
>>>>>>>>>>>> Sascha, Björn,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for your patience with this problem.  I took
>>>>>>>>>>>> a look at the patch, but decided to do it a
>>>>>>>>>>>> different way that is less intrusive and hopefully
>>>>>>>>>>>> addresses all the cases.  The checkin is here:
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/kohler/click/commit/e3c9f295ea1d5c5700e26f19afce873b0ce755f5
>>>>>>>>>>>>
>>>>>>>>>>>> Please let me know if this does not work (I have not tested it).
>>>>>>>>>>>>
>>>>>>>>>>>> Best, Eddie
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Nov 4, 2011 at 3:05 PM, Sascha Alexander
>>>>>>>>>>>> Jopen<jopen at informatik.uni-bonn.de>      wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> after rereading the stated problems, I found that
>>>>>>>>>>>>> my patch did not catch the case when a timer
>>>>>>>>>>>>> schedules a task. The attached patch should fix
>>>>>>>>>>>>> this. Both test scenarios from Björn seem to run
>>>>>>>>>>>>> as expected with this patch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> However, I'm still not sure how much time should
>>>>>>>>>>>>> pass between two driver runs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards, Sascha
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/04/11 17:28, Sascha Alexander Jopen wrote:
>>>>>>>>>>>>>> Hey,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I use a slightly different fix. After running
>>>>>>>>>>>>>> the tasks, I check if there is still at least
>>>>>>>>>>>>>> one task scheduled. If this is the case, I
>>>>>>>>>>>>>> reschedule the router thread for execution in
>>>>>>>>>>>>>> ns-3 again with the smallest possible time
>>>>>>>>>>>>>> offset, which is one microsecond. This way it
>>>>>>>>>>>>>> doesn't matter if there are other timers to be
>>>>>>>>>>>>>> executed or not. For elements which do polling
>>>>>>>>>>>>>> using tasks all the time, this means that
>>>>>>>>>>>>>> simulation time advances only one microsecond
>>>>>>>>>>>>>> per iteration, which could lead to really
>>>>>>>>>>>>>> long-running simulations. Using such elements is
>>>>>>>>>>>>>> possible this way, however.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards, Sascha
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11/04/11 12:15, Björn Lichtblau wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think you are experiencing what I described
>>>>>>>>>>>>>>> as problem A in
>>>>>>>>>>>>>>> http://pdos.csail.mit.edu/pipermail/click/2011-October/010357.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Tasks scheduled by a fired timer are not run
>>>>>>>>>>>>>>> immediately, because run_timers() comes after
>>>>>>>>>>>>>>> run_tasks() and the RouterThread::driver() loop
>>>>>>>>>>>>>>> is only run once at each call from the
>>>>>>>>>>>>>>> simulation. The fix I described there, however,
>>>>>>>>>>>>>>> was not correct. Currently I'm working with
>>>>>>>>>>>>>>> this (but it may be incorrect too):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff --git a/lib/routerthread.cc b/lib/routerthread.cc
>>>>>>>>>>>>>>> index d118a91..7232b79 100644
>>>>>>>>>>>>>>> --- a/lib/routerthread.cc
>>>>>>>>>>>>>>> +++ b/lib/routerthread.cc
>>>>>>>>>>>>>>> @@ -640,14 +640,7 @@ RouterThread::driver()
>>>>>>>>>>>>>>>             _oticks = ticks;
>>>>>>>>>>>>>>>  #endif
>>>>>>>>>>>>>>>             timer_set().run_timers(this, _master);
>>>>>>>>>>>>>>> -#if CLICK_NS
>>>>>>>>>>>>>>> -           // If there's another timer, tell the simulator to make us
>>>>>>>>>>>>>>> -           // run when it's due to go off.
>>>>>>>>>>>>>>> -           if (Timestamp next_expiry = timer_set().timer_expiry_steady()) {
>>>>>>>>>>>>>>> -               struct timeval nexttime = next_expiry.timeval();
>>>>>>>>>>>>>>> -               simclick_sim_command(_master->simnode(), SIMCLICK_SCHEDULE,&nexttime);
>>>>>>>>>>>>>>> -           }
>>>>>>>>>>>>>>> -#endif
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>         } while (0);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>         // run operating system
>>>>>>>>>>>>>>> @@ -667,7 +660,20 @@ RouterThread::driver()
>>>>>>>>>>>>>>>  #if CLICK_NS || BSD_NETISRSCHED
>>>>>>>>>>>>>>>         // Everyone except the NS driver stays in driver() until the driver is
>>>>>>>>>>>>>>>         // stopped.
>>>>>>>>>>>>>>> -       break;
>>>>>>>>>>>>>>> +       if(task_begin() == task_end()){
>>>>>>>>>>>>>>> +#if CLICK_NS
>>>>>>>>>>>>>>> +           // If there's another timer, tell the simulator to make us
>>>>>>>>>>>>>>> +           // run when it's due to go off.
>>>>>>>>>>>>>>> +           if (Timestamp next_expiry = timer_set().timer_expiry_steady()) {
>>>>>>>>>>>>>>> +               struct timeval nexttime = next_expiry.timeval();
>>>>>>>>>>>>>>> +               simclick_sim_command(_master->simnode(), SIMCLICK_SCHEDULE,&nexttime);
>>>>>>>>>>>>>>> +           }
>>>>>>>>>>>>>>> +#endif
>>>>>>>>>>>>>>> +           break;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>  #endif
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So the driver loop only exits on task_begin()
>>>>>>>>>>>>>>> == task_end() (meaning it will loop again if a
>>>>>>>>>>>>>>> timer scheduled a task), and we only tell the
>>>>>>>>>>>>>>> simulator to schedule a timer when the loop
>>>>>>>>>>>>>>> is exiting. While this is fine for me, a
>>>>>>>>>>>>>>> colleague working with ns2 still has problems
>>>>>>>>>>>>>>> with ToSimDevice in some cases, which is too
>>>>>>>>>>>>>>> much to describe now.
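
The control flow of that change can be modelled in a few lines. This is a toy sketch, not RouterThread: ToyDriver and its members are hypothetical, tasks and timers are plain callbacks, and a counter stands in for the call that asks the simulator for the next wakeup.

```cpp
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Toy model of the driver loop; every name here is made up for illustration.
struct ToyDriver {
    std::deque<std::function<void()>> tasks;        // pending tasks
    std::vector<std::function<void()>> due_timers;  // timers due right now
    int tasks_run = 0;
    int sim_schedule_calls = 0;  // stands in for asking the simulator to wake us

    void driver() {
        for (;;) {
            // Run every pending task. A task that always reschedules itself
            // would keep this loop busy forever (the polling problem).
            while (!tasks.empty()) {
                std::function<void()> t = std::move(tasks.front());
                tasks.pop_front();
                t();
                ++tasks_run;
            }
            // Run the due timers; a timer callback may schedule new tasks.
            std::vector<std::function<void()>> fired = std::move(due_timers);
            due_timers.clear();
            for (auto& f : fired)
                f();
            // Exit only when no task is pending, and only then ask the
            // simulator for the next wakeup.
            if (tasks.empty()) {
                ++sim_schedule_calls;
                break;
            }
        }
    }
};
```

If a due timer schedules a task, the loop goes around once more and runs that task within the same call from the simulator, instead of leaving it until the next wakeup.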
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Another potential problem I stumbled over in
>>>>>>>>>>>>>>> the documentation
>>>>>>>>>>>>>>> (http://www.read.cs.ucla.edu/click/doxygen/classTimer.html)
>>>>>>>>>>>>>>> is busy waiting: "Particularly at user level, there can
>>>>>>>>>>>>>>> be a significant delay between a Timer's
>>>>>>>>>>>>>>> nominal expiration time and the actual time
>>>>>>>>>>>>>>> it runs. Elements that desire extremely
>>>>>>>>>>>>>>> precise timings should combine a Timer with a
>>>>>>>>>>>>>>> Task. The Timer is set to go off a bit before
>>>>>>>>>>>>>>> the true expiration time (see
>>>>>>>>>>>>>>> Timer::adjustment()), after which the Task
>>>>>>>>>>>>>>> polls the CPU until the actual expiration
>>>>>>>>>>>>>>> time arrives."
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Elements doing busy waiting are a big
>>>>>>>>>>>>>>> headache in the sim environment, and I have no
>>>>>>>>>>>>>>> nice idea yet for fixing such cases, because time
>>>>>>>>>>>>>>> will never advance without setting a timer...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards, Björn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 11/04/2011 11:29 AM, Giovanni Di Stasi
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have developed an element which behaves
>>>>>>>>>>>>>>>> like a Queue. It has a pull output which is
>>>>>>>>>>>>>>>> connected to a ToSimDevice.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sometimes, when the pull gets called on the
>>>>>>>>>>>>>>>> element, it returns a NULL and sets a timer
>>>>>>>>>>>>>>>> which expires after a few milliseconds
>>>>>>>>>>>>>>>> (e.g. 3). When the timer expires, an
>>>>>>>>>>>>>>>> empty_notifier is "activated"
>>>>>>>>>>>>>>>> (empty_not.wake()). I would expect, at this
>>>>>>>>>>>>>>>> point, the ToSimDevice task to be run, and
>>>>>>>>>>>>>>>> the pull to be called on my element.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Unfortunately this does not happen. The
>>>>>>>>>>>>>>>> Task schedule function seems to be called
>>>>>>>>>>>>>>>> after the notifier is woken up, but the
>>>>>>>>>>>>>>>> task is not run. Is this normal? The task
>>>>>>>>>>>>>>>> is sometimes run a few seconds later. So I
>>>>>>>>>>>>>>>> have two questions: is it possible to schedule
>>>>>>>>>>>>>>>> a task to be run right away (maybe by
>>>>>>>>>>>>>>>> putting it at the top of the Click pending
>>>>>>>>>>>>>>>> tasks)? If not, what delay should I expect
>>>>>>>>>>>>>>>> between the moment I schedule the Task and
>>>>>>>>>>>>>>>> the moment it gets executed?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks, Giovanni
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> click mailing list
>>>>>>>>>>>>>>> click at amsterdam.lcs.mit.edu
>>>>>>>>>>>>>>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click

