Class RealtimeTruncationRetentionRatio
-
public final class RealtimeTruncationRetentionRatio
Retain a fraction of the conversation tokens when the conversation exceeds the input token limit. This allows you to amortize truncations across multiple turns, which can help improve cached token usage.
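The amortization effect can be illustrated with a small local simulation. This is a sketch under simplifying assumptions, not SDK code: the class TruncationAmortizationSketch and its method are hypothetical, every turn is assumed to add the same number of tokens, and a truncation is assumed to trim the conversation to exactly retentionRatio times the limit.

```java
// Hypothetical model for illustration only; not part of the SDK.
class TruncationAmortizationSketch {

    /**
     * Counts how many truncation events occur over a number of turns.
     * Each turn adds tokensPerTurn tokens. When the total exceeds maxTokens,
     * the conversation is trimmed to retentionRatio * maxTokens (assumption).
     */
    static int countTruncations(int turns, int tokensPerTurn, int maxTokens, double retentionRatio) {
        int total = 0;
        int truncations = 0;
        for (int i = 0; i < turns; i++) {
            total += tokensPerTurn;
            if (total > maxTokens) {
                total = (int) Math.round(retentionRatio * maxTokens);
                truncations++;
            }
        }
        return truncations;
    }

    public static void main(String[] args) {
        // 50 turns of 100 tokens each against a 1000-token limit.
        System.out.println("ratio 1.0: " + countTruncations(50, 100, 1000, 1.0) + " truncations"); // 40
        System.out.println("ratio 0.8: " + countTruncations(50, 100, 1000, 0.8) + " truncations"); // 14
    }
}
```

In this model, trimming only down to the limit forces a truncation on every later turn, while retaining 80% leaves headroom so that a truncation happens every third turn. Fewer truncations mean the start of the context changes less often, which is the mechanism by which cached token usage can improve.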
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
public final class RealtimeTruncationRetentionRatio.Builder
  A builder for RealtimeTruncationRetentionRatio.
public final class RealtimeTruncationRetentionRatio.TokenLimits
  Optional custom token limits for this truncation strategy. If not provided, the model's default token limits will be used.
-
Method Summary
Methods (Modifier and Type, Method, Description)
final Double retentionRatio()
  Fraction of post-instruction conversation tokens to retain (0.0-1.0) when the conversation exceeds the input token limit.
final JsonValue _type()
  Use retention ratio truncation.
final Optional<RealtimeTruncationRetentionRatio.TokenLimits> tokenLimits()
  Optional custom token limits for this truncation strategy.
final JsonField<Double> _retentionRatio()
  Returns the raw JSON value of retentionRatio.
final JsonField<RealtimeTruncationRetentionRatio.TokenLimits> _tokenLimits()
  Returns the raw JSON value of tokenLimits.
final Map<String, JsonValue> _additionalProperties()
final RealtimeTruncationRetentionRatio.Builder toBuilder()
final RealtimeTruncationRetentionRatio validate()
  Validates that the types of all values in this object match their expected types recursively.
final Boolean isValid()
Boolean equals(Object other)
Integer hashCode()
String toString()
final static RealtimeTruncationRetentionRatio.Builder builder()
  Returns a mutable builder for constructing an instance of RealtimeTruncationRetentionRatio.
-
Method Detail
-
retentionRatio
final Double retentionRatio()
Fraction of post-instruction conversation tokens to retain (0.0-1.0) when the conversation exceeds the input token limit. Setting this to 0.8 means that messages will be dropped until 80% of the maximum allowed tokens are used. This helps reduce the frequency of truncations and improve cache rates.
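As a worked example of the arithmetic, the following sketch models a single truncation locally. It is not SDK code and makes several assumptions: the class RetentionRatioSketch is hypothetical, instruction tokens are ignored, and whole messages are dropped oldest first until the total is at or below retentionRatio times the limit.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical local model of retention-ratio truncation; not part of the SDK.
class RetentionRatioSketch {

    /**
     * Each list entry is the token count of one message, oldest first.
     * If the total exceeds maxTokens, the oldest messages are dropped until
     * the total is at most retentionRatio * maxTokens.
     */
    static Deque<Integer> truncate(List<Integer> messageTokens, int maxTokens, double retentionRatio) {
        Deque<Integer> kept = new ArrayDeque<>(messageTokens);
        int total = messageTokens.stream().mapToInt(Integer::intValue).sum();
        if (total <= maxTokens) {
            return kept; // under the limit: nothing is dropped
        }
        double target = retentionRatio * maxTokens;
        while (!kept.isEmpty() && total > target) {
            total -= kept.removeFirst(); // drop the oldest message
        }
        return kept;
    }

    public static void main(String[] args) {
        // Five 300-token messages (1500 total) against a 1000-token limit.
        Deque<Integer> kept = truncate(List.of(300, 300, 300, 300, 300), 1000, 0.8);
        int total = kept.stream().mapToInt(Integer::intValue).sum();
        System.out.println(kept.size() + " messages kept, " + total + " tokens");
        // prints "2 messages kept, 600 tokens"
    }
}
```

With a ratio of 0.8 and a limit of 1000, the target is 800 tokens. Because this model drops whole messages, the result lands at 600 tokens rather than exactly 800.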
-
_type
final JsonValue _type()
Use retention ratio truncation.
Expected to always return the following:
JsonValue.from("retention_ratio")
However, this method can be useful for debugging and logging (e.g. if the server responded with an unexpected value).
-
tokenLimits
final Optional<RealtimeTruncationRetentionRatio.TokenLimits> tokenLimits()
Optional custom token limits for this truncation strategy. If not provided, the model's default token limits will be used.
-
_retentionRatio
final JsonField<Double> _retentionRatio()
Returns the raw JSON value of retentionRatio.
Unlike retentionRatio, this method doesn't throw if the JSON field has an unexpected type.
-
_tokenLimits
final JsonField<RealtimeTruncationRetentionRatio.TokenLimits> _tokenLimits()
Returns the raw JSON value of tokenLimits.
Unlike tokenLimits, this method doesn't throw if the JSON field has an unexpected type.
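The difference between the typed accessors and their raw counterparts can be sketched with a small stand-in class. RawFieldSketch is hypothetical and does not use the SDK's JsonField type, and the exception thrown here is an assumption, since this page does not name the exception the SDK throws.

```java
// Hypothetical stand-in showing typed versus raw access; not part of the SDK.
final class RawFieldSketch {
    private final Object raw; // whatever value the JSON decoder produced

    RawFieldSketch(Object raw) {
        this.raw = raw;
    }

    /** Typed accessor: throws when the raw value is not a number. */
    double retentionRatio() {
        if (raw instanceof Number) {
            return ((Number) raw).doubleValue();
        }
        // Assumption: the exception type is chosen for this sketch only.
        throw new IllegalStateException("retentionRatio has an unexpected type: " + raw);
    }

    /** Raw accessor: returns the value as received and never throws. */
    Object rawRetentionRatio() {
        return raw;
    }
}
```

In this model, a field that arrives as the string "high" can still be inspected through the raw accessor for logging, whereas the typed accessor fails.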
-
_additionalProperties
final Map<String, JsonValue> _additionalProperties()
-
toBuilder
final RealtimeTruncationRetentionRatio.Builder toBuilder()
-
validate
final RealtimeTruncationRetentionRatio validate()
Validates that the types of all values in this object match their expected types recursively.
This method is not forwards compatible with new types from the API for existing fields.
-
builder
final static RealtimeTruncationRetentionRatio.Builder builder()
Returns a mutable builder for constructing an instance of RealtimeTruncationRetentionRatio.
The following fields are required:
.retentionRatio()
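The required-field contract can be sketched with a self-contained stand-in. RetentionRatioConfig below is hypothetical and only mirrors the shape of builder(), toBuilder(), and a required retentionRatio; the exception type and message on a missing field are assumptions, not taken from this page.

```java
// Hypothetical stand-in for a builder with one required field; not part of the SDK.
final class RetentionRatioConfig {
    private final double retentionRatio;

    private RetentionRatioConfig(double retentionRatio) {
        this.retentionRatio = retentionRatio;
    }

    double retentionRatio() {
        return retentionRatio;
    }

    /** Returns a fresh mutable builder with no fields set. */
    static Builder builder() {
        return new Builder();
    }

    /** Returns a builder pre-populated with this instance's values. */
    Builder toBuilder() {
        return new Builder().retentionRatio(retentionRatio);
    }

    static final class Builder {
        private Double retentionRatio; // null until set

        Builder retentionRatio(double value) {
            this.retentionRatio = value;
            return this;
        }

        RetentionRatioConfig build() {
            if (retentionRatio == null) {
                // Assumption: exception type and message are chosen for this sketch.
                throw new IllegalStateException("retentionRatio is required, but was not set");
            }
            return new RetentionRatioConfig(retentionRatio);
        }
    }
}
```

With this stand-in, RetentionRatioConfig.builder().retentionRatio(0.8).build() succeeds, calling build() without setting the ratio fails, and toBuilder() allows one value to be changed while the original instance stays untouched.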
-