I'm so amazed to find out just how close we are to the Star Trek voice computer.
I used to use Dragon Dictation to draft my first novel; I had to learn a 'language' to tell the rudimentary engine how to recognize my speech.
And then I discovered [1] and have been using it for some basic speech recognition, amazed at what a local model can do.
But it can't transcribe any text until I finish recording a file; only then does it start working, so the feedback loop is very slow and batch-like.
And now you've posted this cool solution, which streams audio to a model in a continuous series of small chunks. Amazing, just amazing.
Now if I can only figure out how to contribute that kind of streaming speech-to-text to Handy or a similar tool, local STT will be a solved problem for me.
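In case it helps to picture the streaming loop, here's a rough Python sketch. The `StreamingASR` class is a hypothetical stand-in for whatever model you plug in, not the Handy or Nemotron API; only the microphone capture via the `sounddevice` library is real.

    # Rough sketch of the chunked-streaming idea. StreamingASR is a stand-in
    # for the actual model, not a real API; sounddevice is a real library.
    import queue

    import numpy as np
    import sounddevice as sd

    SAMPLE_RATE = 16_000            # most ASR models expect 16 kHz mono audio
    CHUNK_SECONDS = 0.5             # hand the model half a second at a time

    class StreamingASR:
        """Placeholder: a real implementation keeps encoder/decoder state
        between calls so partial text comes back while you're still talking."""
        def transcribe_chunk(self, chunk: np.ndarray) -> str:
            return ""               # a real model returns newly decoded text here

    audio_chunks: "queue.Queue[np.ndarray]" = queue.Queue()

    def on_audio(indata, frames, time, status):
        # sounddevice calls this for every captured block; queue a copy of it.
        audio_chunks.put(indata[:, 0].copy())

    model = StreamingASR()

    with sd.InputStream(samplerate=SAMPLE_RATE, channels=1,
                        blocksize=int(SAMPLE_RATE * CHUNK_SECONDS),
                        callback=on_audio):
        while True:
            chunk = audio_chunks.get()      # blocks until the next 0.5 s arrives
            print(model.transcribe_chunk(chunk), end="", flush=True)

The point is that transcription happens per chunk as you speak, instead of once after the whole recording is finished.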
Happy to answer questions about this (or work with people on further optimizing the open source inference code here). NVIDIA has more inference tooling coming, but it's also fun to hack on the PyTorch/etc stuff they've released so far.
Thank you for sharing! Does your implementation allow running the Nemotron model on Vulkan? Like whisper.cpp? I'm curious to try other models, but I don't have Nvidia, so my choices are limited.
It's an artifact of the camera. The shutter is open long enough that it averages the image over 33 ms.
At some point in the video you can see that a high-speed camera captures the display correctly.
At the 7-minute mark the industrial 14k fps camera shows essentially zero rollover. The earlier rollover does appear to be an artifact of the cheap consumer-grade high-speed camera used.
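To put rough numbers on it (assuming the consumer camera records at about 30 fps, which is where the 33 ms figure would come from):

\[
t_{\text{exposure}} \le \frac{1}{30\ \text{fps}} \approx 33\ \text{ms},
\qquad
t_{\text{frame}} = \frac{1}{14\,000\ \text{fps}} \approx 71\ \mu\text{s}
\]

So each consumer-camera frame integrates over a window several hundred times longer than a single frame of the industrial camera.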
when it comes to real people, they get sued into oblivion for downloading copyrighted content, even for the purpose of learning.
but when Facebook & OpenAI do it, at a much larger scale, suddenly the laws must be changed.
Swartz wasn’t “downloading copyrighted content…for the purpose of learning,” he was downloading with the intent to distribute. That doesn’t justify how he was treated. But it’s not analogous to the limited argument for LLMs that don’t regurgitate the copyrighted content.
This is not about memory or training. The LLM training process is not being run on books streamed directly off the internet or from real-time footage of a book.
What these companies are doing is:
1. Obtain a free copy of a work in some way.
2. Store this copy in a format that's amenable to training.
3. Train their models on the stored copy, months or years after step 1 happened.
The illegal part happens in steps 1 and/or 2. Step 3 is perhaps debatable - maybe it's fair to argue that the model is learning in the same sense as a human reading a book, so the model is perhaps not illegally created.
But the training set that the company is storing is full of illegally obtained or at least illegally copied works.
What they're doing before the training step is exactly like building a library by taking a portable copier into bookshops and copying every book there.
But making copies for yourself, without distributing them, is different than making copies for others. Google is downloading copyrighted content from everywhere online, but they don't redistribute their scraped content.
Even web browsing implies making copies of copyrighted pages; we can't tell the copyright status of a page without loading it, at which point a copy has already been made in memory.
Making copies of an original you don't own/didn't obtain legally is not fair use. Also, this type of personal copying doesn't apply to corporations making copies to be distributed among their employees (it might apply to a company making a copy for archival, though).
> when it comes to real people, they get sued into oblivion for downloading copyrighted content, even for the purpose of learning.
Really? Or do they get sued for sharing, as in republishing without transformation? Arguably, a URL providing copyrighted content is like offering a Xerox machine.
It seems most of the "sued into oblivion" cases are about resharing, not about getting a copy for yourself.
From my observations: cold start, ease of patching.
If you're running a lot of different JS code or restarting the code frequently, it's faster than Node.
Where it's useful: fuzzing. If you have a library/codebase you want to fuzz, you need to restart the code from a snapshot, and other engines seem to do it slower.
It's also really easy to patch the code, because of the codebase size. If you need to trace/observe some behavior, just do it.
Salesforce sandboxing is too easy to escape. The last time I needed to implement a feature for Salesforce, I encountered four different escapes. It was also a horrible dev experience.
It's not about being poor.
First, until about 10 years ago the climate didn't require AC in most of Europe. You had a few hot days, and that was it.
Second, thermal insulation in the US is extremely poor. I think people could cut their AC usage in half if their houses were properly insulated.
Third, the northern European countries still don't have a climate that justifies buying an AC.
Specifically, American houses lack thermal mass because they're built mainly from wood. Concrete and brick will buffer a week or so of heat before they warm up too much.
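To put rough numbers on that buffering effect, here's a back-of-the-envelope Python sketch; the wall area, thickness, material constants and heat gain are all my own illustrative assumptions, not measurements of any real house.

    # Back-of-the-envelope: how long can a masonry envelope soak up a steady
    # heat gain before the mass warms noticeably? All values are assumed.
    WALL_AREA_M2 = 120          # assumed exposed wall area
    WALL_THICKNESS_M = 0.20     # assumed solid brick thickness
    BRICK_DENSITY = 1800        # kg/m^3, typical for fired brick
    BRICK_SPECIFIC_HEAT = 840   # J/(kg*K)
    NET_HEAT_GAIN_W = 1000      # assumed average net gain during a heat wave

    mass_kg = WALL_AREA_M2 * WALL_THICKNESS_M * BRICK_DENSITY
    heat_capacity_j_per_k = mass_kg * BRICK_SPECIFIC_HEAT

    # Energy needed to warm the walls by 5 K, divided by the gain rate:
    seconds = 5 * heat_capacity_j_per_k / NET_HEAT_GAIN_W
    print(f"{mass_kg/1000:.0f} t of brick, ~{seconds/86400:.1f} days to warm by 5 K")

With these assumptions that's roughly two days; a smaller net gain (shading, cool nights to flush the heat) stretches it toward a week, while a wood-framed wall has only a small fraction of that mass to absorb the same heat.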
In Florida, most of the homes are built from concrete brick with wood trusses. There are apartments made from wood and concrete.
It's not just the heat; it's also the humidity. You can bear up to 80 °F before it starts to feel uncomfortable, but humidity will make even 75 °F uncomfortable.
Relative humidity isn't a great indicator of comfort. It's better to look at dew point. The Netherlands is not only cooler on average but also has a lower dew point. This shouldn't be surprising given each country's latitudes.
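For anyone who wants to compare for themselves, dew point can be estimated from temperature and relative humidity with the Magnus approximation. A small Python sketch, where the coefficients are the commonly used Magnus constants and the example inputs are purely illustrative, not measured data for either place:

    import math

    def dew_point_c(temp_c: float, rel_humidity_pct: float) -> float:
        """Approximate dew point (deg C) via the Magnus formula."""
        a, b = 17.27, 237.7  # common Magnus coefficients for roughly -30..+35 C
        gamma = (a * temp_c) / (b + temp_c) + math.log(rel_humidity_pct / 100.0)
        return (b * gamma) / (a - gamma)

    # Illustrative only: the same relative humidity feels very different
    # at Dutch-summer vs Florida-summer temperatures.
    print(round(dew_point_c(22, 80), 1))  # ~18 C dew point: humid but tolerable
    print(round(dew_point_c(32, 80), 1))  # ~28 C dew point: oppressive
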
Both regions have high humidity, but Florida's is higher on average, particularly in summer. Florida has a subtropical to tropical climate with high temperatures and relative humidity often in the 70-90% range year-round, and the summer months bring frequent afternoon thunderstorms.
The Netherlands has a temperate maritime climate, influenced by the North Sea.
Florida and the Netherlands are simply not comparable.
I'm sorry, but it is just mind-boggling to suggest that the Netherlands and Florida have comparable weather in any sense. You wouldn't suggest that the weather in the Netherlands is as hot as in, say, Italy or Greece, and Florida is even hotter than those two.
I'm not saying it's as hot here as it is in Florida. But we've been breaking records left and right, to the point where I bought an AC (a crappy mobile one, for lack of better options for rented apartments here), because every summer there are now months where I can barely sleep without it.
My point was that people often don't realize how humid it is here (you apparently can't believe it either), and that our buildings are built to keep heat in rather than out. So I expect many more ACs to be sold here in the coming years.
It might just be a month or two each year, and it might be worse where you are, but it's already getting pretty bad here thanks to climate change. And that's not going to improve anytime soon, thanks to all of us.
Yes, mostly by using insulating (double) glazing that lets warmth in as light, which then heats up the interior. Think greenhouses. Surround that with poorly insulated walls and limited ventilation: in cold weather they leak heat out, while in warm weather they also heat up in the sun and radiate it inward.
Any home with a natural ACH (air changes per hour) of 1 that's attempting to condition the air (heating or cooling) is wasting a mind-boggling percentage of the energy. Surely that's not the natural ventilation rate of the _typical_ home? That would imply that 50% of homes are worse.
Artificial neural networks work the following way: you have a bunch of "neurons", each with several inputs and an output. A neuron's inputs have weights associated with them; the larger the weight, the more influence that input has on the neuron. These weights need to be represented in our computers somehow, and people usually use IEEE 754 floating point numbers. But those numbers take a lot of space (32 or 16 bits each).
So one approach people have invented is to use a more compact representation of these weights (10, 8, down to 2 bits). This process is called quantisation. A smaller representation makes running the model faster because models are currently limited by memory bandwidth (how long it takes to read the weights from memory), so going from 32 bits to 2 bits potentially gives a 16x speed-up. The surprising part is that the models still produce decent results even after a lot of the information in the weights has been "thrown away".
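As a toy illustration of the idea (not how any particular framework implements it), here's a minimal Python sketch of symmetric 8-bit quantisation of a weight matrix:

    import numpy as np

    def quantize_int8(weights):
        """Symmetric per-tensor quantisation: fp32 weights -> int8 plus one fp32 scale."""
        scale = np.abs(weights).max() / 127.0   # map the largest |weight| onto 127
        q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
        return q, np.float32(scale)

    def dequantize(q, scale):
        """Recover approximate fp32 weights for use during inference."""
        return q.astype(np.float32) * scale

    w = np.random.randn(4096, 4096).astype(np.float32)   # a made-up layer's weights
    q, scale = quantize_int8(w)

    print(w.nbytes // 2**20, "MiB fp32 ->", q.nbytes // 2**20, "MiB int8")   # 64 -> 16
    print("max abs rounding error:", float(np.abs(w - dequantize(q, scale)).max()))

Real low-bit schemes (4 or 2 bits) typically store a separate scale per small block of weights rather than one per tensor, which is roughly what GGUF-style formats do, but the idea is the same: trade a little precision for a lot less memory traffic.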
Not a browser, but a PWA: a web page which you can "install" as an "app". Features like storage, background tasks, and notifications are important for many applications, a messenger for example. They were available, and there is a market for them, but Apple has decided to kill that market.
https://huggingface.co/nvidia/nemotron-speech-streaming-en-0...
https://github.com/m1el/nemotron-asr.cpp https://huggingface.co/m1el/nemotron-speech-streaming-0.6B-g...