chemag's comments

I'm actually very interested in the details. I have a few questions, in case you have cycles to help me understand.

# audio latency

From your comment and the original "An update on Android's audio latency" article, there seem to be two different ways to calculate "audio latency".

* 1. play a well-known sound (a tone), listen for it in real-time

IIUC, the exact operation of this would be something like this:

  open recorder
  loop:
    timestamp1 <- getclock()
    play audio file 
    keep listening to input frames until detecting audio file signature
    timestamp2 <- getclock()
    sample := timestamp2 - timestamp1
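To make the discussion concrete, approach 1 could be sketched offline like this (pure Python, toy signal; all names are made up): once you have the captured audio, locate the known signature with a sliding cross-correlation and convert its sample offset into a latency estimate.

```python
# Hypothetical offline sketch of approach 1 (names made up): after playing
# the tone and capturing the mic input, locate the known signature with a
# sliding cross-correlation and convert its sample offset into latency.

def xcorr_peak(signature, capture):
    """Offset (in samples) where `signature` best matches `capture`."""
    best_offset, best_score = 0, float("-inf")
    for offset in range(len(capture) - len(signature) + 1):
        score = sum(s * capture[offset + i] for i, s in enumerate(signature))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset

def latency_ms(signature, capture, sample_rate_hz):
    """Latency = position of the signature in the capture, in milliseconds."""
    return 1000.0 * xcorr_peak(signature, capture) / sample_rate_hz

# toy capture: the signature appears 480 samples into a 48 kHz recording,
# i.e. a 10 ms round trip
signature = [1.0, -1.0, 1.0, -1.0]
capture = [0.0] * 480 + signature + [0.0] * 100
```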

* 2. measure audio loopback feedback (larsen effect)

Operation should be as follows:

  open recorder
  play audio file
  loop:
    keep listening to input frames for 1-2 seconds
  # analyze captured file
  look for 2 consecutive appearances of the original audio file
  sample := gap between the two appearances
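Approach 2 could similarly be sketched offline (again a made-up toy, not any tool's actual implementation): find every occurrence of the known signature in the capture, and take the gap between the first two as the loopback latency.

```python
# Hypothetical offline sketch of approach 2 (names made up): find every
# occurrence of the known signature in the captured audio; the gap between
# the first two occurrences is the loopback latency.

def find_occurrences(signature, capture, threshold):
    """Offsets where the sliding dot product against `signature` >= threshold."""
    hits = []
    for offset in range(len(capture) - len(signature) + 1):
        score = sum(s * capture[offset + i] for i, s in enumerate(signature))
        if score >= threshold:
            hits.append(offset)
    return hits

def loopback_latency_ms(signature, capture, sample_rate_hz, threshold):
    """Latency = gap between the first two signature occurrences, in ms."""
    first, second = find_occurrences(signature, capture, threshold)[:2]
    return 1000.0 * (second - first) / sample_rate_hz

# toy capture: the signature shows up at sample 100 (direct play) and again
# 480 samples later (loopback), i.e. 10 ms apart at 48 kHz
signature = [1.0, -1.0, 1.0, -1.0]
capture = [0.0] * 100 + signature + [0.0] * 476 + signature + [0.0] * 50
```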

Q1: Did I get this right? Any hunch about the pros and cons of both approaches?

# downlink/uplink latency

Many of the latency measurements break the latency down between audio downlink and uplink.

Q2: How is this measured?

The android doc discusses using a GPIO to have a zero-latency signal. You swap the signal on a well-known GPIO, and then play a well-known audio file. You then connect the GPIO to a speaker, and measure the latency between the GPIO-fed speaker and the actual device speaker. As an optimization, you plug both the GPIO and the audio jack that feeds the actual device speaker into an oscilloscope, and measure the offset there. Is this how audio downlink latency is measured?

Q3: If this is how downlink latency is measured, how do you implement it in prod devices (no access to the board)? Is this something that can be simulated using the usb-c plug?

# tools

The android doc/some googling points to several tools. I found oboetester, splatency, drrickorang, and google walt. I tested all but the walt one.

The first interesting thing is that oboetester, drrickorang, and walt seem to need an external jack-based dongle.

Q4: What is this needed for? The audio latency approaches mentioned above should only need a speaker and a mic.

The experience has not been great with any of the tools. Operationally, I'd like to script these tests; instead, I need to click through GUIs. All the tools I tested also seem to fail often, so I have to repeat the experiments multiple times. Finally, I'd like better documentation on exactly what they do.

Thanks for any help

edit: s/markdown/formatdoc/g


Below is a collaborative answer from myself and my colleague Phil Burk, who is a SWE on the audio framework team.

Thanks for your interest in the details of Android Latency measurement techniques.

Some of the measurements were collected by third parties. We cannot describe their techniques but they are probably similar to ours.

We use OboeTester to measure latency. You can find a description of how to measure Tap-to-Tone Latency and Round Trip Latency in this doc: https://github.com/google/oboe/blob/master/apps/OboeTester/d...

We do not use the Larsen Effect any more because it was too sensitive to variations in gain. We now use a random encoded bit stream that sounds like a short noise burst. We can get a better correlation peak with that signal.
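A toy illustration of that correlation property (made-up sketch, not the actual encoded stream): a tone's autocorrelation has strong secondary peaks at every period, so the match is ambiguous, while a random ±1 bit stream only correlates with itself at lag zero, giving a single sharp peak.

```python
import math
import random

def autocorr(x, lag):
    """Length-normalized autocorrelation of x at a given lag."""
    n = len(x) - lag
    return sum(x[i] * x[i + lag] for i in range(n)) / n

# A pure tone matches shifted copies of itself, so the correlation peak
# is ambiguous; a random +/-1 bit stream only matches at lag 0.
random.seed(0)
tone = [math.sin(2 * math.pi * i / 8) for i in range(256)]  # period = 8 samples
bits = [random.choice((-1.0, 1.0)) for _ in range(256)]

tone_ambiguity = autocorr(tone, 8) / autocorr(tone, 0)  # ~1.0: false peaks
bits_ambiguity = autocorr(bits, 8) / autocorr(bits, 0)  # near 0: one sharp peak
```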

> Many of the latency measurements break the latency down between audio downlink and uplink.

It is very hard to separate the input and output latency without special hardware (like the WALT device). You can measure combined input+output latency using a loopback test. Input latency tends to be much lower than output latency. This is because, when the full duplex stream is stable, the input buffer is close to empty and the output buffer is close to full. Then, if there is a preemption, the input buffer fills up and the output buffer drains, providing glitch protection.
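That buffer dynamic can be sketched with a toy model (entirely made up: one frame per tick, and the app thread stops moving frames for the duration of the preemption):

```python
def survives_preemption(output_frames, input_capacity, preempt_ticks):
    """Toy model of a full-duplex stream during a preemption: the app thread
    stops moving frames, so each tick the input buffer gains one frame (the
    mic keeps producing) and the output buffer loses one (the speaker keeps
    consuming). A glitch happens on output underrun or input overrun."""
    input_frames = 0  # stable stream: input buffer close to empty
    for _ in range(preempt_ticks):
        input_frames += 1
        output_frames -= 1
        if output_frames < 0 or input_frames > input_capacity:
            return False  # audible glitch
    return True
```

In this model, a nearly full output buffer and a nearly empty input buffer buy the stream exactly that many ticks of glitch protection on each side.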

You can measure touch+output latency using tap-to-tone. The screen touch latency is about 15-30 msec. If you use a hardware MIDI controller (like a keyboard or drum pad) instead of tapping the touch screen then you can get a “touch” latency of about 1 msec (MIDI is a lightweight and low latency protocol). Then you can get a better estimate of just the output latency.

> The android doc discusses using a GPIO to have a zero-latency signal.

We don’t normally use that technique because it requires special hardware.

> I found oboetester, splatency, drrickorang, and google walt.

Our group supports OboeTester.

> The first interesting thing is that oboetest, drrickorang, and walt seem to need an external jack-based dongle.

The lowest latency path is usually over the headphone jack (either through 3.5mm or USB dongle). To test this path you do need a "jack-based dongle" aka loopback adapter.

The reason why wired headphones usually give you lower latency is that OEMs often introduce digital signal processing for the speaker to improve the acoustics/quality, which can introduce additional latency. Side note: this is why you will often see "best with headphones" on games and music apps.

That said, OboeTester will work over the speakers and mic in a quiet room. That is because the new random bit technique is more robust.

> I'd like to script these tests.

OboeTester can be scripted. We use it for continuous integration testing. https://github.com/google/oboe/blob/master/apps/OboeTester/d...


Thanks Don and Phil for the great answer. Some further comments:

* I find it interesting that you get better correlations using a well-known noise burst instead of a less chaotic signal (e.g. a tone or a chirp). In retrospect, it makes sense

* for the uplink/downlink breakdown, I see in the Usage.md file in oboetester that you're isolating the downlink measurement with the tap-to-tone experiment. For this experiment, the doc suggests using (a) the jack, to avoid the extra latency of the speaker processing, and (b) a USB-MIDI input device, to avoid the touch-screen latency (15-30 ms).

2 questions here:

* Q1: I assume that, if instead of the jack, you use a usb-c audio adapter accessory mode, there should be no extra latency either, right? (I'm using late pixel phones)

* Q2: which device are you using for the USB-MIDI input?

Thanks again!

edit: s/markdown/formatdoc/g


> * Q1: I assume that, if instead of the jack, you use a usb-c audio adapter accessory mode, there should be no extra latency either, right? (I'm using late pixel phones)

That's correct: by using either the 3.5mm jack or a USB-C adapter, you won't incur any additional latency introduced by DSP to improve the speaker acoustics.

However, the USB path typically does have a few ms higher latency than the 3.5mm jack path.

> * Q2: which device are you using for the USB-MIDI input?

We've used a variety of devices and found the latency differences to be negligible. At the moment I test with an old AKAI LPK25.


I had the same thought. My hunch is that the difference is friction. In both cases, there's money going from the Feds to the University (100% probability) and money going from the student to the Feds (<100% prob).

<1993: there's a bank in the middle. If the student cannot pay, the bank has to do the paperwork to get paid by the Feds. Note that the profit for the bank is limited.

>1993: no bank in the middle. University gets paid right away. Zero risk for them.


Yes, presumably Uncle Sam cares less about making a sound loan in the first place. It is also able to create money out of thin air—in other words, (symbolically) infinite resources.

Getting paid upfront, instead of years later after tons of paperwork, can only accelerate the process.


Live-streaming is typically a type of broadcast. Some latency (a few or even tens of seconds) is typically OK.


True, how about calling it "soft real-time"? It can't stutter too often, it can't be too resource intensive (that would take resources away from whatever task is being live-streamed: games, live-drawing in photoshop with heavy filters, etc.), and there is an upper limit of acceptable latency.


+1 to this. QUIC allows using either CUBIC or BBR [1], so a TCP-vs-QUIC comparison is really a comparison of the exact congestion-control algorithms and parameters used.

The performance effects of QUIC implementing congestion control in userland are more interesting. OTOH, QUIC allows deploying new features to users (through cronet) in an efficient way. TCP does not.

[1] https://chromium.googlesource.com/chromium/src/net/+/master/...

> [Disclaimer: I've worked with some of the people who wrote BBR and QUIC, so I'm biased.] Ditto


I've been trying for a while to understand what org-mode gives you. I saw Dominik's tech talk at Google, and Bieber's discussion on dropping vim for emacs. I really like the approach taken in this article, showing some of the language features.

IIUC, what org-mode provides you is:

1. a (markdown-like) lightweight document markup language, with lots of syntax hooks ("#+") for different tools.

2. some (even lighter, i.e., no "#+" required) organization-based syntax hooks. These are the TODO/DONE/... labels (plus the "[ ]" tidbits), the table syntax, the metadata (e.g. AUTHOR). In fact, the idea of adding metadata to a lightweight markup language is very interesting.

3. some "programmy" syntax items, including things like tags, spreadsheet-like tables, properties, etc.

4. the agenda view. This is a horizontal search on multiple .org files to create a work agenda.

5. some emacs functionality related to automatic recognition and operation on some of the syntax items. For example, org-table-align will "Re-align the table and don't move to another field".

There are lots of other features, but nothing that other lightweight markup languages don't/can't have too.

My main concerns are:

1. it is inextricably tied to emacs. AFAICT, only (5) in the previous list is emacs-only. All the other functionalities are related to the markup syntax.

2. I wish the org-mode language was fully markdown compatible (I can barely remember the syntax of one, and now I need to use 2).


You are very right that other lightweight markup languages could provide syntax for everything that org-mode does. But what would possibly interpret that syntax?

The fact that org-mode is tied to Emacs is both its weakness but also its strength. By sitting on the Emacs interpreter, org-mode imposes no limits to what you can achieve. The synergy with other Emacs packages is enormous.

You also forgot to mention one of the coolest features of Org: exporting your documents to whatever format you may want, e.g. html, latex, markdown, odt, reveal.js - and it's possible to hack the export to fit your needs, usually with modest effort. Here's an example of a workflow to collaborate with Word users: https://lists.gnu.org/archive/html/emacs-orgmode/2015-06/msg...

And here a recent blog post on exporting org to jupyter notebooks: http://kitchingroup.cheme.cmu.edu/blog/2017/01/21/Exporting-...


I can see some of the benefits of the integration: emacs can do the horizontal search for TODO entries, create an agenda, let you edit it, and move the changes to the right .org file. Still, I'd argue that the file is the important thing here, not the tool to manage it. I should be able to edit .org files with any other tool (e.g. an android app that eventually modifies a .org file in github), and then re-process the file.
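For instance, here is a minimal, hypothetical sketch (nowhere near a full org parser) of re-processing an .org file outside Emacs:

```python
import re

# org files are plain text, so any tool can re-process them; this pulls
# TODO/DONE headlines out of an org document.
HEADLINE = re.compile(r"^(\*+)\s+(TODO|DONE)\s+(.*)$", re.MULTILINE)

def org_todos(text):
    """Return (depth, state, title) for each TODO/DONE headline."""
    return [(len(stars), state, title)
            for stars, state, title in HEADLINE.findall(text)]

doc = """\
* TODO write the report
** DONE gather the numbers
* notes
"""
```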

Other languages also allow exporting to other formats. I use md-to-pdf all the time, and it works well. Integrating the use of other tools in the export process should be independent of the usage of emacs.


There is a lot of functionality you can't get without some kind of non-ordinary editor support. The table functionality comes to mind, as does the time management functionality. But really, org-mode's syntax is simple and close enough to MD that it's very easy to implement in other editors. There is a vim package and even an IntelliJ package. But I challenge you to start implementing the non-syntax parts in another editor; you will quickly see that there is a lot of it.


> I should be able to edit .org files [...] and then re-process the file.

You can. It's a plain text file; Emacs provides a "live" and highly interactive UI for it, but nothing prevents you editing elsewhere and invoking Emacs, in "batch" (i.e. UI-less pure language interpreter) mode if you like, to evaluate, export, or perform whatever other actions on the file. I do this all the time, albeit in an interactive Emacs, with Editorial on my phone and files synced in Dropbox.

There are also libraries in languages other than elisp which provide some export functionality; notably, Github uses org-ruby, I think it's called, to HTMLify org files for web UI rendering.


Well, you can edit it in any text editor. You just don't get the automatic formatting, coloring, table calculations and so on that all depend on elisp.


> The synergy with other Emacs packages is enormous.

I've been experimenting toward a modern SQL interaction mode, and an Org buffer makes a brilliant place for output - native handling of tabular data, syntax-highlighted code blocks each with the query that produced a given result, headlines, plenty of place for user-added annotations, built-in export, overall a ton of work that I don't have to do.


I think a fruitful way of looking at org-mode is as a way to pull together many organizational aspects of your life into one system, as opposed to having 5 different apps with boundaries between them.

Doing an outline for that presentation next week, but need to put it down now, and don't want to forget about it? Schedule a reminder or due date right there in the document and it'll show up in your agenda. No switching over to a reminder app. Want to log time spent on it? Do it right there. In the middle of all that and get a phone call with a new task? You're a hotkey away from recording that info without leaving what you're working on.

For me it's the integration of all things organizational that makes it compelling.


The biggest thing, to me, is Babel: it's a system that lets you combine documentation and code into a single unit, and lets you combine code from many different languages into a single tool for processing. Now, instead of an unintelligible readme with an undocumented handful of scripts, I've actually got real documentation and an entire project basically in a single file.

http://www.jstatsoft.org/v46/i03

I've been thinking of starting a project specifically in org-mode for quite a while.


I'm curious about the use cases for sub-microsecond time sync. Entertainment can probably do with millisecond accuracy. Other use cases mentioned by the Wifi Alliance preso are healthcare, industrial, automotive, and IoT. Do you know any specific use cases?



I should have mentioned that this only applies to IBSS (ad-hoc), not to (infra) BSS, where the AP sends the beacon.

