Q: In Singularity, is it possible to run programs written in unsafe languages, such as C/C++/assembly?

A: The system that the paper describes could not run programs written in anything other than Sing#. To support programs written in unsafe languages, Singularity would have to use hardware VM protection for isolation. Also, since IPC relies on all programs following the channel contracts, that would somehow have to be verified or enforced for programs not written in Sing#.

Q: A problem with this new way of designing systems is backwards compatibility. Most of the systems we have seen so far have major differences in design, but at a certain point can all be made to act very similarly (POSIX has been implemented for all major systems, and could be implemented for JOS, even if inefficiently). Singularity, however, is too different, which really prevents platform-independent software. Why use such a system if the vast majority of preexisting software cannot be easily ported to it? Isn't this an insurmountable hurdle to adoption?

A: Singularity only runs programs written in Sing# -- it doesn't run programs written in C. Most people run lots of C programs on their laptops, web servers, &c, and the effort required to re-implement all that software in Sing# would be huge. For most people, that cost would be far greater than the benefits they'd get from switching to Singularity.

Q: In general, operating systems on typical computers don't seem to be catered to a particular language. Typical programming languages are built based on the properties of the OS, not the other way around. From a normal user's perspective, this enables freedom of choice in what language to use (although some are better than others). When would it be a good idea to use what basically seems like an OS + language combo? It seems a little impractical, although the considerations in designing the OS are interesting.

A: An O/S that can support many languages does seem more generally useful than one that's tied to a single language. For that reason (and probably others) I suspect it would be difficult to persuade users of laptops, desktops, and web servers to adopt Singularity. On the other hand, someone who cared a lot about reliability might think Singularity's approach was great. For example, if you are designing the embedded controller for a car's brake system, you might be very willing to write all the software in a single type-safe language, and care not at all about support for other languages. Singularity was a research project to explore questions like "what might the benefits be if programs were required to be written in a single type-safe language?" Now we know a lot about the answers, and we can make more informed decisions on these topics. In that sense Singularity is a successful research project. The authors are not trying to persuade anyone to use Singularity; at most they are trying to persuade other O/S designers to think about their ideas.

Q: If Singularity is not being used for performance reasons, what aspects of Singularity are used in modern operating systems?

A: Microsoft and others use lots of automated tools to help verify correctness and find bugs in systems code, an approach which Singularity espoused. The linear type system idea has been picked up and extended by Rust, a language intended for (among other things) kernel development.
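To make the Rust connection concrete, here is a minimal sketch in ordinary Rust (not Sing#, and not Singularity's API) of how move semantics approximate a linear type: once a value is handed off, the original name can no longer be used, which is how Singularity treats blocks passed over channels.

    fn transfer(buf: Vec<u8>) -> usize {
        // this function now owns buf; the caller's binding is no longer usable
        buf.len()
    }

    fn main() {
        let buf = vec![0u8; 16];
        let n = transfer(buf); // ownership moves, like sending a block on a channel
        // buf[0] = 1;         // compile error: buf was moved above
        println!("transferred {} bytes", n);
    }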
The general idea of signed software components with verified properties (the manifests) is older than Singularity, but has been gradually creeping into operating systems for a long time.

Q: When can we expect an OS like Singularity to be in use?

A: I expect a gradual shift to type-safe languages, but it's not clear that that implies there will be a need or desire for operating systems like Singularity. I think one lesson from the paper is that the ideas, while interesting and thought-provoking, do not provide benefits so compelling that anyone is going to be motivated to switch their laptop or web server from Linux to Singularity. It's more likely that future systems will adopt individual techniques (e.g. channel contracts for IPC).

Q: Is it very practical / used outside of research?

A: I don't think Singularity is (or was) used outside the Singularity research group. It probably would not be practical for any of us to use Singularity instead of, say, Linux, because we'd have to first re-write our favorite programs from C to Sing#. It would be more practical if you wanted to build a small self-contained system, like an embedded controller.

Q: Have there been other "memory safe" OSs developed since Singularity? Has this kind of work not seen attention from the community because of the performance overheads imposed by type and memory safety?

A: I'm not aware of a subsequent single-address-space operating system. There was a previous project you might be interested in called Spin. The idea of using high-level languages in the kernel is alive and well; have a look at the Rust language. I think Singularity's idea of using a single safe language for everything hasn't caught on because 1) it imposes significant limitations, such as not being able to run existing programs written in C &c, and 2) the benefits are only modest, so people aren't very motivated to switch from whatever they are currently doing. The general idea of using automated analysis tools to look for bugs in systems code, on the other hand, has become a lot more popular in the last 10 or 15 years.

Q: Does the reliance on 1970s OS designs also stem in large part from the prevailing CPU architecture coming from the 1970s? If so, how does one address the chicken-and-egg problem of the cooperation of new OS and architecture technology?

A: I don't think O/S architecture is significantly limited by hardware design right now. Most operating system services and abstractions are much higher-level than the hardware. For example, file systems require a disk (or SSD or perhaps RAM) to store the files, but file system architecture is not really constrained by what the disk can do. Similarly, network protocols like TCP require a network, but are architecturally fairly independent of the details of the network. I think O/S architectural innovation is driven mostly by high-level demand (e.g. the recent need for virtualization and containers in cloud computing). Of course the O/S changes all the time at a more detailed level in response to hardware changes. There has been a huge amount of this over the decades: operating systems have added support for paging, networks, graphical displays, big address spaces, multi-core, virtualization hardware, &c. A situation in which a new incompatible O/S might be attractive is a brand-new platform for which compatibility isn't an issue. Smart phones were like this when they first arrived.

Q: How hard is it to write something for Singularity?
A: I don't know -- I've never used Singularity; my only basis for opinion is what's in the paper. My guess is it wouldn't be harder to write a given piece of code in Sing# than in C. In practice I think you would be more limited by lack of libraries &c than by Singularity's design.

Q: How does garbage collection provide memory safety?

A: GC is one part of Singularity's overall plan for memory safety. The Sing# byte-code verifier does much of the work for memory safety, by checking that the program never fabricates pointers. That is, it's OK for the program to use a pointer it got from the memory allocator, or to store pointers in data structures and then read them and use them later, but the verifier checks that the program never creates invalid pointers like

    char *p = (char *) 9999; // cast an integer to a pointer

One of the things that might go wrong is that the program might have a valid pointer, but then the memory the pointer points to is freed and re-used for an object with a different type. If that could happen, the program could dereference the pointer and perhaps use an arbitrary integer as a pointer. This is a common bug in C code, because C requires the program to explicitly free() memory. For example:

    #include <stdlib.h>

    struct T1 { char *p; };
    struct T2 { int x; };

    void fn(void) {
        struct T1 *t1 = malloc(sizeof(struct T1));
        free(t1);
        struct T2 *t2 = malloc(sizeof(struct T2)); // maybe now t2 == t1
        t2->x = 9999;
        t1->p[0] = 1; // use after free: writes to address 9999 -- illegal!
    }

The garbage collector guarantees that memory is never freed and re-used while there is still a valid pointer pointing to it. The collector does this by looking at all the live pointers (on the stack, in registers, in reachable data), and only freeing memory that isn't reachable by any chain of pointers. So, in Sing#, the above code would have no free(t1), and the collector would notice that the program still had a reference to *t1 at the end of the function, and not free/reuse it until after the last use.

Q: Why is doing everything in software faster? I've heard that hardware acceleration generally makes things faster/better.

A: For process isolation, the traditional hardware approach involves paging, which costs a little bit of time (maybe a cycle) on each memory reference while the CPU hardware checks the page table (or TLB) to see if the access is allowed. Singularity instead performs static verification of programs before they run, which allows Singularity (in many cases) to do nothing at all at run-time to enforce isolation. So in this case, the software approach of doing nothing at run-time is faster than the hardware approach of checking each memory access against the page table.

Q: How are larger or more demanding programs loaded into SIPs? Given that you cannot dynamically load code into a SIP after it is created, how would you circumvent the restriction on program size when using programs whose binaries are too large to all fit in memory at once?

A: I agree that Singularity programs have to fit in RAM. If you were desperate, you could split a program into pieces, and run only one piece at a time.

Q: At first I thought that several SIPs could access an endpoint on a channel, but given the mutual-exclusion guarantee mentioned in the last paragraph of section 3.2.1, does this mean that only one SIP can access an endpoint of a channel (and therefore a locking scheme is not required to perform mutual exclusion)?

A: Only one SIP can have a reference to each endpoint. There is never a situation where two SIPs could try to send on the same endpoint (or receive on it) at the same time, so I think you're right that SIPs don't have to lock endpoints. On the other hand the sender might send at the same time that the receiver receives, so the Singularity designers must have taken steps to make this potential race work out correctly. The paper says that channel contracts are required to enforce alternation in message direction, which probably helps. Maybe also the send and receive system call implementations (inside the kernel) use locks.
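A rough way to see the "exactly one owner, so no locks" argument, again in ordinary Rust rather than Sing# (the Endpoint type here is hypothetical): because the endpoint is moved rather than shared, the compiler rejects any second user.

    use std::thread;

    // hypothetical endpoint type; deliberately not Clone, so it cannot be aliased
    struct Endpoint {
        id: u32,
    }

    impl Endpoint {
        fn send(&mut self, msg: u32) {
            println!("endpoint {} sends {}", self.id, msg);
        }
    }

    fn main() {
        let mut ep = Endpoint { id: 1 };
        let handle = thread::spawn(move || {
            ep.send(42); // this thread is now the endpoint's only owner
        });
        // ep.send(7);   // compile error: ep was moved into the thread above
        handle.join().unwrap();
    }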
Q: Why do the authors of Singularity view excessive SIP creation as an issue? One SIP is associated with one security principal, which seems to be an appropriate usage pattern.

A: The authors are thinking of situations where a single server needs to act as different users (principals) at different times. Since there's no way in Singularity to change a SIP's principal, the only way to act as a different principal is to create a new SIP with the different principal.

Q: Could the exchange heap become a bottleneck for communication because of the zero-copy allocation mechanism?

A: I suspect that use of the exchange heap could be very efficient, but the coding style might be a bit awkward. The exchange heap is in shared memory, so it has the potential to be very fast by avoiding copies. But it might be hard in practice to avoid copying data from the sending SIP's heap to the exchange heap, and from the exchange heap into the receiving SIP's heap. I think applications that care a lot about performance in this area might, through careful coding, be able to manipulate data directly in the exchange heap, without copying. On the other hand, computers can copy data from memory to memory at gigabytes per second, so few applications are likely to be limited by the need to copy data into and out of the exchange heap.
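As an analogy for the zero-copy handoff (this uses Rust's standard channels, not Singularity's exchange heap): sending a heap allocation moves the pointer, not the bytes, which is the effect the exchange heap is designed to get.

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (tx, rx) = mpsc::channel();
        let block: Box<[u8]> = vec![7u8; 1 << 20].into_boxed_slice();
        let sender = thread::spawn(move || {
            tx.send(block).unwrap(); // moves the pointer; the megabyte is not copied
        });
        let block = rx.recv().unwrap(); // the receiver now owns the same allocation
        println!("received {} bytes without copying them", block.len());
        sender.join().unwrap();
    }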
Q: How is the safety of an MBP verified? What if there is an error in the manifest?

A: Have a look at this paper for more about manifests and sealing: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-2006-51.pdf

Q: Doesn't the MBP feature limit portability across systems?

A: I imagine the authors intend MBPs to allow programs to be portable from one Singularity computer to another, as long as both provide the API and server versions that the program needs.

Q: It seems that linked stacks could be inefficient if one is calling several short-lived functions in a row from a stack frame whose stack pointer is near the end of one of the linked segments, because the OS would constantly be linking and unlinking a new stack segment to make space for each new function call. The benefit of linked stacks is stated in the paper as reducing thread memory overhead, but it seems they could have more overhead, from linking/unlinking and fragmentation, than a standard dedicated-stack approach in situations like this. Are linked stacks actually worth it in practice, or do such situations prevent their adoption?

A: You are right, there is a potential bad case in which there are repeated calls right at the boundary of a stack segment. I don't know if Sing# has a solution. Maybe the problem doesn't come up very often for Singularity programs. Fixed-size thread stacks are also a potential problem, due to memory use. They are fine if you have only hundreds of threads. But there are styles of programming that involve thousands of threads, and frequent thread creation/deletion, and then big fixed-size stacks can be a disaster. People use both schemes. Most programs are happy either way; a few programs really need one or the other.

Q: Does it result in unusually verbose code? For example, the contract-based channels seem to require a very detailed and complete contract, covering exactly every possible state of the channel.

A: Sometimes things like contracts turn out to make programming easier, because they make interfaces clearer to programmers, or they allow use of automated tools. Ordinary type-checking is like that -- I'm glad that I have to tell the C compiler the types of my variables. And, having written lots of IPC and RPC code, I have found that automated support for typed messages is a huge win. But I don't know for the specific case of Singularity contracts.

Q: I don't quite get the difference between a manifest and a contract. They both seem to be imposing restrictions/rules on how programs should behave.

A: A contract declares the types of messages sent via IPC between SIPs. E.g. a contract might say "an OPEN request to the file server must consist of a string (the file name)." Manifests describe all the different parts of a program that it needs in order to run, e.g. which libraries it uses, what devices (if any) it needs access to, which servers it will talk to. When you run a program, the O/S uses the manifest to get the program started.
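This isn't Sing# syntax, but a contract's message declarations can be pictured as plain algebraic data types. A hypothetical file-server contract might pin down something like the following; a real contract also specifies the legal ordering of messages, e.g. that each Request is answered by exactly one Response.

    // hypothetical file-server message types, sketched in Rust
    #[allow(dead_code)]
    enum Request {
        Open { name: String }, // "an OPEN request must consist of a string (the file name)"
        Read { nbytes: usize },
    }

    #[allow(dead_code)]
    enum Response {
        Data(Vec<u8>),
        Error(i32),
    }

    fn main() {
        let _req = Request::Open { name: String::from("README") };
    }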
Q: In the second-to-last paragraph on page 2, the authors mention that context switches between software-isolated processes (SIPs) have very low overhead because the TLB and virtually-addressed caches need not be flushed. I understand that they don't need to be flushed because the two processes being switched between are running in the same virtual address space (because memory protection between processes is software- rather than hardware-based). It seems that even though the TLB and cache don't need to be flushed, as the two processes are working in non-overlapping areas of memory (since they can't share), there will still be a lot of cache misses (and possibly TLB misses) when the switched-to process starts running. Does invalidating the TLB and cache (the actual operations, not any fallout that might come from them) actually have a high enough cost that removing those operations significantly increases the speed of process switching? (It seems that most of the fallout that normally comes from flushing the TLB and cache would happen anyway on a process switch, as the two processes are not working from the same space in virtual/physical memory.)

A: For a pagetable-per-process O/S like xv6 or Linux, flushing the TLB is not itself an expensive operation. But the consequence is that there will be lots of TLB misses immediately after the flush, each of which takes dozens of cycles. Singularity doesn't use the TLB at all, and thus doesn't suffer from TLB misses. Singularity and its applications will still incur data and instruction cache misses, and the miss rates will probably be higher after context switches. So context switch will still have some indirect cost.

Q: In the first paragraph of section 3.3.2 (Scheduler) on page 5, it says: "Whenever a scheduling timer interrupt occurs, all threads in the unblocked list are moved to the end of the preempted list, followed by the thread that was running when the timer fired. Then, the first thread from the unblocked list is scheduled and the scheduling timer is reset." I don't understand how this works: if, when a scheduling timer interrupt occurs, we move all threads on the unblocked list to the preempted list (so that the unblocked list is empty), how can we then schedule the first thread from the unblocked list? Does a scheduling timer interrupt do more than just signal for another process to be run? For that matter, why do all the unblocked threads move to the preempted list? Do they somehow lose priority over the preempted threads once the timer interrupt happens? Or is it because the threads in the unblocked list are threads that were unblocked in the last quantum, between this scheduling interrupt and the last one, and so theoretically don't have priority over threads in the preempted list, which might have been unblocked the quantum before, etc.? That doesn't quite explain the "first thread from the (empty) unblocked list" part, though.

A: I don't understand the paper's discussion of scheduling queues.

Q: What are some of the limitations on the types of applications that can be built on top of Singularity, given the many restrictions imposed by the system?

A: The main limitation is that applications have to be written in Sing#. They can't be written in C, for example. So Singularity cannot run any of the actual applications I currently use. If one is willing to re-write applications in Sing#, I'm not aware of any serious limitations. You have to live with Sing#'s rules (e.g. type safety), but they seem no more limiting than those of many mainstream languages (e.g. Java). Singularity doesn't support some O/S features, such as read/write shared memory; but that feature is a convenience or performance trick, not a necessity for any application.

Q: With respect to section 2.3 (last line), why is it inconvenient to handle error conditions at send operations? Wouldn't it be easier to track the cause of the error and handle it at send operations than when it is received?

A: I do not understand this aspect of the paper. Maybe part of the justification is that all programs have to handle the case in which they never get a response because the target died while handling the request -- i.e. a receive error. So it's most convenient if programs have to handle only this kind of error.
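One way to see why the receive side has to handle failure anyway: even a send that succeeds can be followed by the peer dying before it replies. A small illustration with Rust's standard channels (an analogy, not Singularity's IPC):

    use std::sync::mpsc;

    fn main() {
        let (tx, rx) = mpsc::channel::<u32>();
        drop(tx); // the "server" dies before ever replying
        match rx.recv() {
            Ok(reply) => println!("reply: {}", reply),
            Err(e) => println!("no reply is coming: {}", e), // the error surfaces at the receive
        }
    }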