[Click] schedule tasks [ns-3 + click]

Eddie Kohler ekohler at gmail.com
Thu Dec 22 11:11:32 EST 2011


Hi Sascha,

Fair enough -- that's exactly the problem I was worried about.
Please take a look at commit 2754456. Does it help? Do you advise a 
different constant?

Eddie

On 12/22/11 6:15 AM, Sascha Alexander Jopen wrote:
> Hi,
>
> Although I'm not really happy with those 1us steps, I think they are
> still necessary. timeval_ceil() does help when the scheduled time has
> a nonzero nanosecond remainder. If it does not, timeval_ceil() changes
> nothing. Some elements like InfiniteSource will reschedule a new task
> all the time. Because the work done within the elements doesn't consume
> any simulation time, this will lead to a rescheduled task for the exact
> same timestamp every time, effectively ending in an endless loop. Every
> polling-style element will suffer from this problem.
> If nothing else adds simulation time between
> task executions within the Click driver, then we need it there, just
> before ns scheduling, don't we?
>
> A simple test script with only an InfiniteSource shows this behaviour:
>
> InfiniteSource
>          ->  IPEncap(SRC eth0:ip, DST eth0:bcast,  TTL 255, PROTO 253)
>          ->  Queue
>          ->  IPPrint(Sending)
>          ->  ToSimDevice(eth0);
>
> Sascha
>
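The argument above can be illustrated with a small standalone sketch. It is not Click or ns-3 code: `ceil_to_usec` and `runs_until` are hypothetical helpers, time is modelled as a plain nanosecond counter, and the only assumption is that an always-active task asks to be run again at the rounded-up value of "now plus some step".

```cpp
#include <cassert>
#include <cstdint>

// Round a nanosecond count up to a whole microsecond (result still in ns).
// Hypothetical helper; it only mimics the effect of a "ceil" conversion.
inline int64_t ceil_to_usec(int64_t ns) {
    assert(ns >= 0);
    return (ns + 999) / 1000 * 1000;
}

// Model an always-active task (think InfiniteSource): each driver run
// consumes no simulated time and requests the next run at
// ceil_to_usec(now + step_ns).
// Returns the number of driver runs needed for simulated time to reach
// end_ns, or -1 if max_runs is exhausted first (time is stuck).
inline long runs_until(int64_t start_ns, int64_t end_ns,
                       int64_t step_ns, long max_runs) {
    int64_t now = start_ns;
    for (long runs = 0; runs < max_runs; ++runs) {
        if (now >= end_ns)
            return runs;
        now = ceil_to_usec(now + step_ns);  // next wake-up time
    }
    return -1;
}
```

With `step_ns = 0`, rounding up moves a timestamp such as 1500 ns to 2000 ns once and then never again, so `runs_until(1500, 5000, 0, 1000)` gives up. With `step_ns = 1000` (the 1us step discussed here), every run advances time by at least one microsecond, at the cost of one driver run per simulated microsecond.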
> On 12/21/11 14:55, Eddie Kohler wrote:
>> Hi Sascha, Björn,
>>
>> Removing the 1us steps WAS intentional. I thought maybe
>> timeval_ceil() would be enough here. Sascha, I looked at the commit
>> history to try to figure out why the 1us steps were necessary, but
>> the commit message wasn't enough. Do you think the 1us is
>> necessary? If so, why? Is there a risk that an always-active Click
>> config will cause ns time to stop?
>>
>> Eddie
>>
>>
>> 2011/12/21 Björn Lichtblau<lichtbla at informatik.hu-berlin.de>:
>>> Hi,
>>>
>>> my tests went well so far, and i also noticed from the tests that
>>> the artificial 1us steps i did not like were gone. If that's
>>> intended: double thumbs up!
>>>
>>> Regards, Björn
>>>
>>> On 21.12.2011 11:50, Sascha Alexander Jopen wrote:
>>>> Hey Eddie,
>>>>
>>>> I didn't test your changes yet, but after looking into the code
>>>> I think you accidentally removed the newly introduced artificial
>>>> time increase in lib/routerthread.cc line 683. I think this
>>>> should read
>>>>
>>>> struct timeval nexttime = (Timestamp::now() +
>>>> Timestamp::make_usec(1)).timeval_ceil();
>>>>
>>>> Regards, Sascha
>>>>
>>>> Am 20.12.2011 17:31, schrieb Eddie Kohler:
>>>>> Hi Sascha,
>>>>>
>>>>> I applied a different version of your patch, using a new
>>>>> function (Timestamp::timeval_ceil()) written for the purpose.
>>>>> This also slightly changed the active() case in ns3
>>>>> scheduling. Take a look; does it work for you?
>>>>>
>>>>> Best, Eddie
>>>>>
>>>>>
>>>>> On Fri, Dec 16, 2011 at 5:40 PM, Sascha Alexander Jopen
>>>>> <jopen at informatik.uni-bonn.de>   wrote:
>>>>>> Hey Eddie,
>>>>>>
>>>>>> i think my patch is still necessary. The one integrated
>>>>>> into ns3 is about rounding errors between double and
>>>>>> integer time representations, which led to similar endless
>>>>>> loops but on ns3 side.
>>>>>>
>>>>>> Regards, Sascha
>>>>>>
>>>>>> Am 16.12.2011 15:20, schrieb Eddie Kohler:
>>>>>>> Hi Björn, Sascha,
>>>>>>>
>>>>>>> So, I'm a bit behind. Should I apply Sascha's patch, or
>>>>>>> some other version?
>>>>>>>
>>>>>>> Best, Eddie
>>>>>>>
>>>>>>>
>>>>>>> On 12/04/2011 10:25 AM, Björn Lichtblau wrote:
>>>>>>>> Hi, this flaw was already fixed on the ns3 side (in
>>>>>>>> ns3-dev):
>>>>>>>> http://code.nsnam.org/ns-3-dev/rev/0d04b625ea54
>>>>>>>>
>>>>>>>> Regarding Eddie's change, it works, but I have had no time yet
>>>>>>>> to test it extensively. While it works, I think it's a
>>>>>>>> bit of a workaround to let the simulator do artificial 1
>>>>>>>> us steps. However, it should not hurt in most cases,
>>>>>>>> and doing it more cleanly may be hard. I'll comment in more
>>>>>>>> detail soon.
>>>>>>>>
>>>>>>>> Regards, Björn
>>>>>>>>
>>>>>>>> On 03.12.2011 23:42, Sascha Alexander Jopen wrote:
>>>>>>>>> Hey,
>>>>>>>>>
>>>>>>>>> after running some real simulations I found that
>>>>>>>>> scheduling timers may lead to infinite loops with
>>>>>>>>> nsclick's scheduling. On 64-bit systems the nanosecond
>>>>>>>>> precision of a timestamp is mapped to a truncated
>>>>>>>>> microsecond-precision timeval. The next ns event is
>>>>>>>>> scheduled at this microsecond timeval, but Click's
>>>>>>>>> run_timers() method doesn't run the timer, because
>>>>>>>>> its expiry has not been reached. nsclick is then scheduled
>>>>>>>>> again for this timer, again some nanoseconds too
>>>>>>>>> early.
>>>>>>>>>
>>>>>>>>> The attached patch fixes this problem by scheduling
>>>>>>>>> nsclick to the next microsecond after the timer
>>>>>>>>> expires.
>>>>>>>>>
>>>>>>>>> Regards, Sascha
>>>>>>>>>
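The truncation problem described in this message can be reproduced with a few lines of arithmetic. The sketch below is an illustration only: `NsTime`, `to_timeval_trunc`, `to_timeval_ceil` and `reaches` are hypothetical names, not Click's `Timestamp` API, and the rounding shown is just one plausible way to get a timeval that is never earlier than the timer expiry.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/time.h>   // struct timeval (POSIX)

// Hypothetical stand-in for a nanosecond-precision timestamp.
struct NsTime {
    int64_t sec;
    int32_t nsec;   // 0 .. 999999999
};

// Truncating conversion: silently drops the sub-microsecond part.
inline timeval to_timeval_trunc(NsTime t) {
    timeval tv;
    tv.tv_sec = t.sec;
    tv.tv_usec = t.nsec / 1000;
    return tv;
}

// Rounding-up conversion: the result is never earlier than t.
inline timeval to_timeval_ceil(NsTime t) {
    assert(t.nsec >= 0 && t.nsec < 1000000000);
    timeval tv;
    tv.tv_sec = t.sec;
    tv.tv_usec = (t.nsec + 999) / 1000;
    if (tv.tv_usec == 1000000) {   // carry into the seconds field
        tv.tv_sec += 1;
        tv.tv_usec = 0;
    }
    return tv;
}

// True if waking up at tv is at or after the timer expiry t.
inline bool reaches(timeval tv, NsTime t) {
    int64_t wake_ns = int64_t(tv.tv_sec) * 1000000000LL
                    + int64_t(tv.tv_usec) * 1000;
    int64_t expiry_ns = t.sec * 1000000000LL + t.nsec;
    return wake_ns >= expiry_ns;
}
```

For an expiry of 10 s + 1500 ns, the truncated timeval is 10 s + 1 us, which is 500 ns too early, so a timer check at that wake-up time would not fire and the same wake-up would be requested again. The rounded-up timeval is 10 s + 2 us, which is past the expiry.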
>>>>>>>>> On 11/09/11 18:56, Eddie Kohler wrote:
>>>>>>>>>> Sascha, Björn,
>>>>>>>>>>
>>>>>>>>>> Thanks for your patience with this problem.  I took
>>>>>>>>>> a look at the patch, but decided to do it a
>>>>>>>>>> different way that is less intrusive and hopefully
>>>>>>>>>> addresses all the cases.  The checkin is here:
>>>>>>>>>>
>>>>>>>>>> https://github.com/kohler/click/commit/e3c9f295ea1d5c5700e26f19afce873b0ce755f5
>>>>>>>>>>
>>>>>>>>>> Please let me know if this does not work (I have not tested it).
>>>>>>>>>>
>>>>>>>>>> Best, Eddie
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Nov 4, 2011 at 3:05 PM, Sascha Alexander
>>>>>>>>>> Jopen<jopen at informatik.uni-bonn.de>     wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> after rereading the stated problems, i found that
>>>>>>>>>>> my patch did not catch the case when a timer
>>>>>>>>>>> schedules a task. The attached patch should fix
>>>>>>>>>>> this. Both test scenarios from Björn seem to run
>>>>>>>>>>> as expected with this patch.
>>>>>>>>>>>
>>>>>>>>>>> However, I'm still not sure how much time should
>>>>>>>>>>> pass between two driver runs.
>>>>>>>>>>>
>>>>>>>>>>> Regards, Sascha
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 11/04/11 17:28, Sascha Alexander Jopen wrote:
>>>>>>>>>>>> Hey,
>>>>>>>>>>>>
>>>>>>>>>>>> i use a slightly different fix. After running
>>>>>>>>>>>> the tasks, i check if there is still at least
>>>>>>>>>>>> one task scheduled. If this is the case, i
>>>>>>>>>>>> reschedule the router thread for execution in
>>>>>>>>>>>> ns-3 again with the smallest possible time
>>>>>>>>>>>> offset, which is one microsecond. This way it
>>>>>>>>>>>> doesn't matter if there are other timers to be
>>>>>>>>>>>> executed or not. For elements which do polling
>>>>>>>>>>>> using tasks all the time, this means that
>>>>>>>>>>>> simulation time advances only one microsecond
>>>>>>>>>>>> per iteration which could lead to really long
>>>>>>>>>>>> running simulations. Using such elements is
>>>>>>>>>>>> possible this way, however.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards, Sascha
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/04/11 12:15, Björn Lichtblau wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> i think you are experiencing what i described
>>>>>>>>>>>>> as problem A in
>>>>>>>>>>>>> http://pdos.csail.mit.edu/pipermail/click/2011-October/010357.html
>>>>>>>>>>>>>
>>>>>>>>>>>>> Tasks scheduled by a fired timer are not run immediately, because
>>>>>>>>>>>>> run_timers() comes after run_tasks() and the
>>>>>>>>>>>>> RouterThread::driver() loop is only run once
>>>>>>>>>>>>> at each call from the simulation. The fix I
>>>>>>>>>>>>> described there, however, was not correct;
>>>>>>>>>>>>> currently I'm working with this (which may be
>>>>>>>>>>>>> incorrect too):
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/lib/routerthread.cc b/lib/routerthread.cc
>>>>>>>>>>>>> index d118a91..7232b79 100644
>>>>>>>>>>>>> --- a/lib/routerthread.cc
>>>>>>>>>>>>> +++ b/lib/routerthread.cc
>>>>>>>>>>>>> @@ -640,14 +640,7 @@ RouterThread::driver()
>>>>>>>>>>>>>              _oticks = ticks;
>>>>>>>>>>>>>  #endif
>>>>>>>>>>>>>              timer_set().run_timers(this, _master);
>>>>>>>>>>>>> -#if CLICK_NS
>>>>>>>>>>>>> -           // If there's another timer, tell the simulator to make us
>>>>>>>>>>>>> -           // run when it's due to go off.
>>>>>>>>>>>>> -           if (Timestamp next_expiry = timer_set().timer_expiry_steady()) {
>>>>>>>>>>>>> -               struct timeval nexttime = next_expiry.timeval();
>>>>>>>>>>>>> -               simclick_sim_command(_master->simnode(), SIMCLICK_SCHEDULE, &nexttime);
>>>>>>>>>>>>> -           }
>>>>>>>>>>>>> -#endif
>>>>>>>>>>>>> +
>>>>>>>>>>>>>          } while (0);
>>>>>>>>>>>>>
>>>>>>>>>>>>>          // run operating system
>>>>>>>>>>>>> @@ -667,7 +660,20 @@ RouterThread::driver()
>>>>>>>>>>>>>  #if CLICK_NS || BSD_NETISRSCHED
>>>>>>>>>>>>>          // Everyone except the NS driver stays in driver() until the driver is
>>>>>>>>>>>>>          // stopped.
>>>>>>>>>>>>> -       break;
>>>>>>>>>>>>> +       if(task_begin() == task_end()){
>>>>>>>>>>>>> +#if CLICK_NS
>>>>>>>>>>>>> +           // If there's another timer, tell the simulator to make us
>>>>>>>>>>>>> +           // run when it's due to go off.
>>>>>>>>>>>>> +           if (Timestamp next_expiry = timer_set().timer_expiry_steady()) {
>>>>>>>>>>>>> +               struct timeval nexttime = next_expiry.timeval();
>>>>>>>>>>>>> +               simclick_sim_command(_master->simnode(), SIMCLICK_SCHEDULE, &nexttime);
>>>>>>>>>>>>> +           }
>>>>>>>>>>>>> +#endif
>>>>>>>>>>>>> +           break;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>  #endif
>>>>>>>>>>>>>      }
>>>>>>>>>>>>>
>>>>>>>>>>>>> So the driver loop only exits when task_begin()
>>>>>>>>>>>>> == task_end() (meaning it will loop again if a
>>>>>>>>>>>>> timer scheduled a task), and we only tell the
>>>>>>>>>>>>> simulator to schedule a timer when the loop
>>>>>>>>>>>>> is exiting. While this is fine for me, a
>>>>>>>>>>>>> colleague working with ns2 still has problems
>>>>>>>>>>>>> with ToSimDevice in some cases, which is too
>>>>>>>>>>>>> much to describe now.
>>>>>>>>>>>>>
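The control flow described above can be sketched outside of Click. Everything in this block is a toy model with hypothetical names (`ToyDriver`, `due_timers`, `loop_until_idle`); it only mirrors the ordering in the thread, where tasks run before timers, and compares a single pass per simulator call with looping until no task is pending.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Toy model of one router thread; not Click's RouterThread.
struct ToyDriver {
    std::deque<std::function<void()>> tasks;        // scheduled tasks
    std::vector<std::function<void()>> due_timers;  // timers due "now"
    int tasks_run = 0;

    void run_tasks() {
        // Run only the tasks that were pending when this pass started.
        std::size_t n = tasks.size();
        for (std::size_t i = 0; i < n; ++i) {
            std::function<void()> t = std::move(tasks.front());
            tasks.pop_front();
            t();
            ++tasks_run;
        }
    }

    void run_timers() {
        // Fire every due timer once; a timer may schedule new tasks.
        std::vector<std::function<void()>> fired = std::move(due_timers);
        due_timers.clear();
        for (auto &t : fired)
            t();
    }

    // One call from the simulator. With loop_until_idle == false this is a
    // single pass; with true it repeats until no task is pending.
    // Returns true if tasks are still pending on exit.
    bool driver(bool loop_until_idle) {
        do {
            run_tasks();
            run_timers();
        } while (loop_until_idle && !tasks.empty());
        return !tasks.empty();
    }
};
```

In the single-pass variant, a task scheduled by a timer is left pending until the simulator calls the driver again. In the looping variant it runs within the same call. Note that the looping variant never returns if a task reschedules itself unconditionally, which is the always-active case raised elsewhere in this thread.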
>>>>>>>>>>>>> Another potential problem I stumbled over in
>>>>>>>>>>>>> the documentation
>>>>>>>>>>>>> (http://www.read.cs.ucla.edu/click/doxygen/classTimer.html)
>>>>>>>>>>>>> is busy waiting: "Particularly at user level, there can
>>>>>>>>>>>>> be a significant delay between a Timer's
>>>>>>>>>>>>> nominal expiration time and the actual time
>>>>>>>>>>>>> it runs. Elements that desire extremely
>>>>>>>>>>>>> precise timings should combine a Timer with a
>>>>>>>>>>>>> Task. The Timer is set to go off a bit before
>>>>>>>>>>>>> the true expiration time (see
>>>>>>>>>>>>> Timer::adjustment()), after which the Task
>>>>>>>>>>>>> polls the CPU until the actual expiration
>>>>>>>>>>>>> time arrives."
>>>>>>>>>>>>>
>>>>>>>>>>>>> Elements doing busy waiting are a big
>>>>>>>>>>>>> headache in the sim environment, and I have no nice
>>>>>>>>>>>>> idea yet for fixing such cases, because time will
>>>>>>>>>>>>> never advance without setting a timer...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards, Björn
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/04/2011 11:29 AM, Giovanni Di Stasi
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have developed an element which behaves
>>>>>>>>>>>>>> like a Queue. It has a pull output which is
>>>>>>>>>>>>>> connected to a ToSimDevice.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sometimes, when the pull gets called on the
>>>>>>>>>>>>>> element, it returns a NULL and sets a timer
>>>>>>>>>>>>>> which expires after a few milliseconds
>>>>>>>>>>>>>> (e.g. 3). When the timer expires, an
>>>>>>>>>>>>>> empty_notifier is "activated"
>>>>>>>>>>>>>> (empty_not.wake()). I would expect, at this
>>>>>>>>>>>>>> point, the ToSimDevice task to be run, and
>>>>>>>>>>>>>> the pull to be called on my element.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Unfortunately this does not happen. The
>>>>>>>>>>>>>> Task schedule function seems to be called
>>>>>>>>>>>>>> after the notifier is woken up, but the
>>>>>>>>>>>>>> task is not run. Is this normal? The task
>>>>>>>>>>>>>> is sometimes run a few seconds later. So I
>>>>>>>>>>>>>> have two questions: is it possible to schedule
>>>>>>>>>>>>>> a task to be run right away (maybe by
>>>>>>>>>>>>>> putting it at the top of the Click pending
>>>>>>>>>>>>>> tasks)? If not, what delay should I expect
>>>>>>>>>>>>>> between the moment I schedule the Task and
>>>>>>>>>>>>>> the moment it gets executed?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks, Giovanni
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> click mailing list
>>>>>>>>>>>>> click at amsterdam.lcs.mit.edu
>>>>>>>>>>>>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click
>

