
spring-ai-openai fails when calling Xinference 1.4.1

1. Xinference error log

The failure occurs on calls to the /v1/chat/completions endpoint. Note that json.loads fails at line 1 column 1 (char 0), which is what it raises when the body it receives is empty or not JSON at all:

2025-04-06 15:48:51 xinference | return await dependant.call(**values)
2025-04-06 15:48:51 xinference | File "/usr/local/lib/python3.10/dist-packages/xinference/api/restful_api.py", line 1945, in create_chat_completion
2025-04-06 15:48:51 xinference | raw_body = await request.json()
2025-04-06 15:48:51 xinference | File "/usr/local/lib/python3.10/dist-packages/starlette/requests.py", line 252, in json
2025-04-06 15:48:51 xinference | self._json = json.loads(body)
2025-04-06 15:48:51 xinference | File "/usr/lib/python3.10/json/init.py", line 346, in loads
2025-04-06 15:48:51 xinference | return _default_decoder.decode(s)
2025-04-06 15:48:51 xinference | File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
2025-04-06 15:48:51 xinference | obj, end = self.raw_decode(s, idx=_w(s, 0).end())
2025-04-06 15:48:51 xinference | File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
2025-04-06 15:48:51 xinference | raise JSONDecodeError("Expecting value", s, err.value) from None
2025-04-06 15:48:51 xinference | json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

2. Calling the same endpoint with the Python openai client works fine

3. Wireshark packet captures reveal the problem

Capture of the Python openai client request:

Hypertext Transfer Protocol
POST /v1/chat/completions HTTP/1.1\r\n
Request Method: POST
Request URI: /v1/chat/completions
Request Version: HTTP/1.1
Host: localhost:9997\r\n
Accept-Encoding: gzip, deflate\r\n
Connection: keep-alive\r\n
Accept: application/json\r\n
Content-Type: application/json\r\n
User-Agent: OpenAI/Python 1.70.0\r\n
X-Stainless-Lang: python\r\n
X-Stainless-Package-Version: 1.70.0\r\n
X-Stainless-OS: Windows\r\n
X-Stainless-Arch: other:amd64\r\n
X-Stainless-Runtime: CPython\r\n
X-Stainless-Runtime-Version: 3.11.9\r\n
Authorization: Bearer not empty\r\n
X-Stainless-Async: false\r\n
x-stainless-retry-count: 0\r\n
x-stainless-read-timeout: 600\r\n
Content-Length: 95\r\n
\r\n
[Response in frame: 61]
[Full request URI: http://localhost:9997/v1/chat/completions]
File Data: 95 bytes
JavaScript Object Notation: application/json
JSON raw form: {"messages": [{"content": "你是谁","role": "user"}],"model": "qwen2-instruct","max_tokens": 1024}

Capture of the spring-ai request:

Hypertext Transfer Protocol, has 2 chunks (including last chunk)
POST /v1/chat/completions HTTP/1.1\r\n
Request Method: POST
Request URI: /v1/chat/completions
Request Version: HTTP/1.1
Connection: Upgrade, HTTP2-Settings\r\n
Host: 192.168.3.100:9997\r\n
HTTP2-Settings: AAEAAEAAAAIAAAAAAAMAAAAAAAQBAAAAAAUAAEAAAAYABgAA\r\n
Settings - Header table size : 16384
Settings Identifier: Header table size (1)
Header table size: 16384
Settings - Enable PUSH : 0
Settings Identifier: Enable PUSH (2)
Enable PUSH: 0
Settings - Max concurrent streams : 0
Settings Identifier: Max concurrent streams (3)
Max concurrent streams: 0
Settings - Initial Windows size : 16777216
Settings Identifier: Initial Windows size (4)
Initial Window Size: 16777216
Settings - Max frame size : 16384
Settings Identifier: Max frame size (5)
Max frame size: 16384
Settings - Max header list size : 393216
Settings Identifier: Max header list size (6)
Max header list size: 393216
Transfer-encoding: chunked\r\n
Upgrade: h2c\r\n
User-Agent: Java-http-client/17.0.14\r\n
Authorization: Bearer not empty\r\n
Content-Type: application/json\r\n
\r\n
[Full request URI: http://192.168.3.100:9997/v1/chat/completions]
HTTP chunked response
File Data: 143 bytes
JavaScript Object Notation: application/json
JSON raw form: {"messages": [{"content": "你好,介绍下你自己!","role": "user"}],"model": "qwen2-instruct","stream": false,"temperature": 0.7,"top_p": 0.7}

The difference is in the headers: the spring-ai request carries Connection: Upgrade, HTTP2-Settings and Upgrade: h2c, i.e. it asks the server to upgrade the connection to cleartext HTTP/2, and it sends the body with Transfer-encoding: chunked instead of a Content-Length. A quick search suggests Xinference does not support HTTP/2, which would explain why it ends up reading an empty body and json.loads fails.

Debugging the code

Debugging shows that the OpenAiApi used by OpenAiChatModel declares:

import org.springframework.web.client.RestClient;
import org.springframework.web.reactive.function.client.WebClient;

private final RestClient restClient;
private final WebClient webClient;

In this setup both HTTP clients resolve to the JDK's built-in jdk.internal.net.http.HttpClientImpl, which uses HTTP/2 by default (matching the Java-http-client/17.0.14 User-Agent and the h2c upgrade headers in the capture).
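As an aside (not the route taken below), if only the blocking path mattered, the JDK client could be kept but pinned to HTTP/1.1. A minimal sketch, assuming Spring Framework 6.1+ for JdkClientHttpRequestFactory; the class name is hypothetical:

import java.net.http.HttpClient;
import org.springframework.http.client.JdkClientHttpRequestFactory;
import org.springframework.web.client.RestClient;

public class Http11RestClientFactory {

    // Pin the JDK client to HTTP/1.1 so it never sends the h2c upgrade headers.
    public static RestClient create(String baseUrl) {
        HttpClient http1 = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1) // the JDK default is HTTP_2
                .build();
        return RestClient.builder()
                .baseUrl(baseUrl)
                .requestFactory(new JdkClientHttpRequestFactory(http1))
                .build();
    }
}

This only covers RestClient; the streaming path goes through WebClient, which is why the patch below replaces both transports.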

4. Modify the OpenAiApi class

Download the spring-ai source code and locate the spring-ai-openai module at:

\spring-ai\models\spring-ai-openai

The fix replaces the default transports: OkHttp backs the blocking RestClient, and an explicitly configured Reactor Netty HttpClient backs the WebClient; over cleartext connections both speak plain HTTP/1.1. Only the imports and the constructor change; the rest of the class is identical to the upstream source and is elided below. After editing, rebuild the spring-ai-openai module and install it into your local Maven repository so your project picks up the patched class.

OpenAiApi.java after the change:

/*
 * Copyright 2023-2025 the original author or authors.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.springframework.ai.openai.api;

import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;
import java.util.function.Predicate;

import com.fasterxml.jackson.annotation.JsonFormat;
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonInclude;
import com.fasterxml.jackson.annotation.JsonInclude.Include;
import com.fasterxml.jackson.annotation.JsonProperty;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

import org.springframework.ai.model.ApiKey;
import org.springframework.ai.model.ChatModelDescription;
import org.springframework.ai.model.ModelOptionsUtils;
import org.springframework.ai.model.NoopApiKey;
import org.springframework.ai.model.SimpleApiKey;
import org.springframework.ai.openai.api.common.OpenAiApiConstants;
import org.springframework.ai.retry.RetryUtils;
import org.springframework.core.ParameterizedTypeReference;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.util.Assert;
import org.springframework.util.CollectionUtils;
import org.springframework.util.LinkedMultiValueMap;
import org.springframework.util.MultiValueMap;
import org.springframework.web.client.ResponseErrorHandler;
import org.springframework.web.client.RestClient;
import org.springframework.web.reactive.function.client.WebClient;

// imports added for the HTTP/1.1 transports
import java.time.Duration;
import java.util.concurrent.TimeUnit;
import io.netty.channel.ChannelOption;
import okhttp3.ConnectionPool;
import okhttp3.OkHttpClient;
import org.springframework.http.client.ClientHttpRequestFactory;
import org.springframework.http.client.OkHttp3ClientHttpRequestFactory;
import org.springframework.http.client.reactive.ReactorClientHttpConnector;
import reactor.netty.http.client.HttpClient;

/**
 * Single class implementation of the
 * <a href="https://platform.openai.com/docs/api-reference/chat">OpenAI Chat Completion API</a> and
 * <a href="https://platform.openai.com/docs/api-reference/embeddings">OpenAI Embedding API</a>.
 */
public class OpenAiApi {

    public static Builder builder() {
        return new Builder();
    }

    public static final OpenAiApi.ChatModel DEFAULT_CHAT_MODEL = ChatModel.GPT_4_O;

    public static final String DEFAULT_EMBEDDING_MODEL = EmbeddingModel.TEXT_EMBEDDING_ADA_002.getValue();

    private static final Predicate<String> SSE_DONE_PREDICATE = "[DONE]"::equals;

    private final String completionsPath;

    private final String embeddingsPath;

    private final RestClient restClient;

    private final WebClient webClient;

    private OpenAiStreamFunctionCallingHelper chunkMerger = new OpenAiStreamFunctionCallingHelper();

    /**
     * Create a new chat completion api.
     * @param baseUrl api base URL.
     * @param apiKey OpenAI apiKey.
     * @param headers the http headers to use.
     * @param completionsPath the path to the chat completions endpoint.
     * @param embeddingsPath the path to the embeddings endpoint.
     * @param restClientBuilder RestClient builder.
     * @param webClientBuilder WebClient builder.
     * @param responseErrorHandler Response error handler.
     */
    public OpenAiApi(String baseUrl, ApiKey apiKey, MultiValueMap<String, String> headers, String completionsPath,
            String embeddingsPath, RestClient.Builder restClientBuilder, WebClient.Builder webClientBuilder,
            ResponseErrorHandler responseErrorHandler) {

        Assert.hasText(completionsPath, "Completions Path must not be null");
        Assert.hasText(embeddingsPath, "Embeddings Path must not be null");
        Assert.notNull(headers, "Headers must not be null");

        this.completionsPath = completionsPath;
        this.embeddingsPath = embeddingsPath;

        Consumer<HttpHeaders> finalHeaders = h -> {
            if (!(apiKey instanceof NoopApiKey)) {
                h.setBearerAuth(apiKey.getValue());
            }
            h.setContentType(MediaType.APPLICATION_JSON);
            h.addAll(headers);
        };

        // Back the blocking RestClient with OkHttp, which speaks plain HTTP/1.1
        // over cleartext connections (no "Upgrade: h2c" negotiation).
        OkHttpClient okHttpClient = new OkHttpClient.Builder()
            .connectTimeout(120, TimeUnit.SECONDS) // connect timeout
            .readTimeout(120, TimeUnit.SECONDS) // read timeout
            .connectionPool(new ConnectionPool(100, 10, TimeUnit.MINUTES))
            .build();
        ClientHttpRequestFactory requestFactory = new OkHttp3ClientHttpRequestFactory(okHttpClient);

        this.restClient = restClientBuilder.baseUrl(baseUrl)
            .defaultHeaders(finalHeaders)
            .requestFactory(requestFactory)
            .defaultStatusHandler(responseErrorHandler)
            .build();

        // Create a Reactor Netty HttpClient instance (HTTP/1.1 by default)
        // and plug it into the reactive WebClient.
        HttpClient reactorHttpClient = HttpClient.create()
            .responseTimeout(Duration.ofSeconds(1000)) // response timeout
            .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 100000); // connect timeout
        ReactorClientHttpConnector clientHttpConnector = new ReactorClientHttpConnector(reactorHttpClient);

        this.webClient = webClientBuilder.clientConnector(clientHttpConnector)
            .baseUrl(baseUrl)
            .defaultHeaders(finalHeaders)
            .build();
    }

    // ... The remainder of the class is unchanged from the upstream spring-ai source:
    // getTextContent, chatCompletionEntity, chatCompletionStream, embeddings, the
    // ChatModel / EmbeddingModel / ChatCompletionFinishReason enums, the request and
    // response records (ChatCompletionRequest, ChatCompletionMessage, ChatCompletion,
    // ChatCompletionChunk, Usage, Embedding, EmbeddingRequest, EmbeddingList, ...)
    // and the Builder. Elided here for brevity. ...

}

The main modification is to this method (the OpenAiApi constructor):

public OpenAiApi(String baseUrl, ApiKey apiKey, MultiValueMap<String, String> headers, String completionsPath,
        String embeddingsPath, RestClient.Builder restClientBuilder, WebClient.Builder webClientBuilder,
        ResponseErrorHandler responseErrorHandler)
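The modified constructor body is not reproduced here, so what follows is a minimal sketch of the kind of change involved, not the exact patch. Based on the packet captures above, the goal is to stop the default JDK HttpClient from sending an HTTP/2 upgrade (Upgrade: h2c) with a chunked body: pin the blocking RestClient to plain HTTP/1.1, and back the streaming WebClient with Reactor Netty, which speaks HTTP/1.1 by default. The patched 1.0.0-M6-XIN build pulls in OkHttp and reactor-netty for this; the JdkClientHttpRequestFactory shown here is just one way to get an HTTP/1.1-only client:

import java.net.http.HttpClient;

import org.springframework.http.client.JdkClientHttpRequestFactory;
import org.springframework.http.client.reactive.ReactorClientHttpConnector;

public OpenAiApi(String baseUrl, ApiKey apiKey, MultiValueMap<String, String> headers, String completionsPath,
        String embeddingsPath, RestClient.Builder restClientBuilder, WebClient.Builder webClientBuilder,
        ResponseErrorHandler responseErrorHandler) {

    // Blocking path (stream=false requests): force HTTP/1.1 so the client
    // never sends the "Upgrade: h2c" header that Xinference chokes on.
    HttpClient http11 = HttpClient.newBuilder()
            .version(HttpClient.Version.HTTP_1_1)
            .build();
    restClientBuilder.requestFactory(new JdkClientHttpRequestFactory(http11));

    // Streaming path: Reactor Netty's connector uses HTTP/1.1 by default,
    // which is what the reactor-netty dependency added below is for.
    webClientBuilder.clientConnector(new ReactorClientHttpConnector());

    // ... the rest of the original constructor (baseUrl, default headers,
    // completionsPath, embeddingsPath, responseErrorHandler) stays unchanged.
}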

Add the following dependencies to the spring-ai-openai module's pom.xml:

<!-- production dependencies -->
<dependency>
    <groupId>com.squareup.okhttp3</groupId>
    <artifactId>okhttp</artifactId>
    <version>4.12.0</version>
</dependency>
<dependency>
    <groupId>io.projectreactor.netty</groupId>
    <artifactId>reactor-netty</artifactId>
    <version>1.3.0-M1</version>
</dependency>
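Note: reactor-netty 1.3.0-M1 is a milestone build. If it does not resolve from Maven Central, the Spring milestone repository (the same one Spring AI 1.0.0-M6 itself is published to) will likely be needed as well:

<repositories>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones</name>
        <url>https://repo.spring.io/milestone</url>
    </repository>
</repositories>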

Then compile and install the module with mvn.
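For example, from the spring-ai-openai module directory (assuming its version has been bumped to the custom 1.0.0-M6-XIN used below; -DskipTests just speeds up the build):

mvn clean install -DskipTests

This installs the patched jar into the local Maven repository so the consuming project can resolve it.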

Use the rebuilt spring-ai-openai in the consuming project as follows:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-openai</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai</artifactId>
    <version>1.0.0-M6-XIN</version>
</dependency>
<dependency>
    <groupId>com.squareup.okhttp3</groupId>
    <artifactId>okhttp</artifactId>
    <version>4.12.0</version>
</dependency>
<dependency>
    <groupId>io.projectreactor.netty</groupId>
    <artifactId>reactor-netty</artifactId>
    <version>1.3.0-M1</version>
</dependency>
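With the patched jar in place, point the starter at Xinference. A minimal configuration sketch; the property keys are the standard Spring AI OpenAI ones, and the host, dummy key, and model are the values from the captures above:

spring.ai.openai.base-url=http://192.168.3.100:9997
spring.ai.openai.api-key=not empty
spring.ai.openai.chat.options.model=qwen2-instruct

A quick smoke test using the auto-configured ChatClient.Builder (a hypothetical demo bean inside a @Configuration class, for illustration only):

@Bean
CommandLineRunner chatDemo(ChatClient.Builder chatClientBuilder) {
    // Sends the same prompt as the spring-ai capture above; with the HTTP/1.1
    // clients wired in, the request carries Content-Length instead of
    // Transfer-encoding: chunked, and Xinference parses the JSON body normally.
    return args -> System.out.println(
            chatClientBuilder.build().prompt().user("你好,介绍下你自己!").call().content());
}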
