Support streaming the request() and requestUrl() response body

Use case or problem

The request() and requestUrl() APIs are used to make requests that bypass CORS restrictions.

I’m using these APIs to fetch data from HTTP endpoints that don’t set CORS headers, e.g. the HTTP endpoints of various LLM providers. These endpoints often stream the content back in chunks, which is extremely useful to display before the entire request completes.

Proposed solution

I propose returning a readable stream, similar to the native JavaScript fetch() function. In this proposal, the promise returned by request() resolves once the headers are available, and the body is exposed to the caller as a stream rather than only after the full body has been received. If the new interface turns out not to be backward compatible, a new requestUrlStream() could be introduced instead.
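A rough sketch of what the consuming side could look like (requestUrlStream() is hypothetical here, and the exact response shape is an assumption, not part of the current API):

        // Hypothetical API sketch: requestUrlStream() does not exist today.
        async function streamBody(url) {
            // The promise resolves as soon as the response headers are in...
            const response = await requestUrlStream({ url, method: 'GET' });
            console.log(response.status, response.headers);

            // ...while the body arrives incrementally as a web ReadableStream.
            const reader = response.body.getReader();
            const decoder = new TextDecoder();
            while (true) {
                const { done, value } = await reader.read();
                if (done) break;
                console.log(decoder.decode(value, { stream: true })); // render each chunk
            }
        }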

Current workaround

Unfortunately there is currently no workaround: on mobile, the only way for an Obsidian plugin to make cross-origin HTTP requests is through the request() and requestUrl() APIs, since Node.js is not available there. It is not currently possible to stream the result of a cross-origin HTTP request made with request() or requestUrl().
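For contrast, this is roughly what a call looks like today: the awaited requestUrl() promise only resolves after the entire body has been received, so nothing can be rendered early (the endpoint URL below is a placeholder):

        // Current behavior: the whole body is buffered before the promise
        // resolves. (Run inside an async function in a plugin.)
        const res = await requestUrl({
            url: 'https://api.example.com/v1/chat/completions', // placeholder
            method: 'POST',
            contentType: 'application/json',
            body: JSON.stringify({
                messages: [{ role: 'user', content: 'hello?' }],
                stream: false, // streamed chunks would be invisible anyway
            }),
        });
        console.log(res.status); // headers and body become available together,
        console.log(res.text);   // so partial output can never be shown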


+1 for this. Streaming is the standard for LLMs; please support streaming with requestUrl. It would greatly enhance every Obsidian plugin that uses LLMs!


+1 for this. I used node-fetch as a replacement, but it’s desktop-only :sweat_smile:

+1, would love to get support for streamed LLM responses!

Do you guys have an example of how it is currently accomplished using fetch()? Sample code of how you would be using it would be great.

Maybe take a look at this document: Using the Fetch API
A simple example looks like this:

        async function streamFetchWithHeadersAndBody(url, options) {
            const response = await fetch(url, options);
            if (!response.ok) {
                throw new Error(`HTTP error ${response.status}`);
            }

            // fetch() resolves as soon as the headers are in; the body
            // is then consumed chunk by chunk.
            const reader = response.body.getReader();
            const decoder = new TextDecoder();
            //let accumulatedData = new Uint8Array(); // to save all chunks
            try {
                while (true) {
                    const { done, value } = await reader.read();
                    if (done) break;
                    //accumulatedData = new Uint8Array([...accumulatedData, ...value]);
                    const text = decoder.decode(value, { stream: true });
                    console.log(text); // or hand the chunk to another method
                }
            } finally {
                reader.releaseLock();
            }
            //console.log(decoder.decode(accumulatedData)); // get all values at once
        }

        const url = 'http://example.com/streaming-endpoint';
        const options = {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
            },
            body: JSON.stringify({
                "stream": true,
                //...
            }),
        };
        streamFetchWithHeadersAndBody(url, options);
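One small addition to the example above: when the loop ends, calling decode() with no arguments flushes any multi-byte character that was split across the final chunks:

        // After the read loop, flush the decoder's internal buffer:
        const tail = decoder.decode(); // no argument signals end of stream
        if (tail) console.log(tail);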

I’m using the CORS feature of the Copilot plugin and can’t stream; I really hope this feature gets implemented!

Yes, I saw the code samples from MDN as well. I just want to know how people would actually be using them with LLMs.

I don’t think it’s much different from the example; just change the parameters and the address.

Here I’ve run a local open-source model with LM Studio, and the code looks like this. You can replace the url, api_key, and model to point at an online model instead; the format follows the OpenAI API specification.

<html>

<head>
</head>

<body>
    <script>
        async function streamFetchWithHeadersAndBody(url, options) {
            const response = await fetch(url, options);
            if (!response.ok) {
                throw new Error(`HTTP error ${response.status}`);
            }

            const reader = response.body.getReader();
            const decoder = new TextDecoder();

            try {
                while (true) {
                    const { done, value } = await reader.read();
                    if (done) break;
                    const text = decoder.decode(value, { stream: true });

                    // A chunk may carry several SSE events, so parse each
                    // "data:" line individually. NOTE: a "data:" line split
                    // across two chunks would break JSON.parse here; see the
                    // line-buffering sketch below.
                    for (const line of text.split('\n')) {
                        if (!line.startsWith('data:')) continue;
                        const payload = line.slice('data:'.length).trim();
                        if (payload === '[DONE]') return;

                        const data = JSON.parse(payload);
                        const content = data.choices[0].delta.content;
                        if (!content) continue; // e.g. the role-only first delta
                        console.log(content);

                        document.body.appendChild(document.createTextNode(content));
                    }
                }
            } finally {
                reader.releaseLock();
            }
        }

        const url = 'http://localhost:1234/v1/chat/completions';
        const options = {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json', 
                //"Authorization": "Bearer {api_key}"
            },
            body: JSON.stringify({
                "model": "qwen2.5-0.5b-instruct-q8_0.gguf",
                "messages": [{ "role": "user", "content": "hello?" }],
                "stream": true
            }),
        };
        streamFetchWithHeadersAndBody(url, options);
    </script>
</body>

</html>
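One caveat with the chunk-at-a-time parsing above: a "data:" line can be split across two reads, and JSON.parse would then throw. A minimal line-buffering sketch (feed() is an illustrative helper name, not an existing API):

        // Minimal sketch: buffer partial SSE lines across chunks.
        let buffer = '';
        function feed(chunkText, onPayload) {
            buffer += chunkText;
            const lines = buffer.split('\n');
            buffer = lines.pop(); // keep the trailing partial line for the next read
            for (const line of lines) {
                if (line.startsWith('data:')) onPayload(line.slice('data:'.length).trim());
            }
        }

        // In the read loop above:
        // feed(decoder.decode(value, { stream: true }),
        //      (payload) => { /* check for [DONE], then JSON.parse */ });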