gemma 3n android studio
Can I add Gemma 3n to my Android Studio project?
Please share some info and advice.
Thank you!
Hi @kevingamaliel,
Yes, you absolutely can add Gemma 3n to your Android Studio project! Google has designed Gemma 3n specifically for efficient on-device execution, and they provide the necessary tools and resources through Google AI Edge.
You can use the Google AI Edge tools and libraries (specifically the LLM Inference API) to load and run a pre-trained Gemma 3n model directly on your Android device.
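For a rough idea of what that looks like, here is a minimal Kotlin sketch of the LLM Inference API. The model file name and path are placeholders; you download or bundle the Gemma 3n `.task` file yourself and add the `com.google.mediapipe:tasks-genai` dependency to your module's Gradle file.

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Minimal sketch: load a Gemma 3n .task file with the MediaPipe LLM Inference API
// and run a single prompt. The model path below is a placeholder; point it at
// wherever your app stores the model.
fun runGemma3n(context: Context, prompt: String): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma-3n-E4B-it-int4.task") // placeholder path
        .setMaxTokens(1024)
        .build()

    val llmInference = LlmInference.createFromOptions(context, options)
    return try {
        llmInference.generateResponse(prompt)
    } finally {
        llmInference.close()
    }
}
```

In a real app you would typically create the `LlmInference` instance once, off the main thread, and reuse it, since loading a multi-gigabyte model is expensive.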
Kindly follow this link, where the steps are explained one by one. If you have any concerns, let us know and we will assist you. Thank you.
Hello,
I'm encountering a persistent initialization issue with `MediaPipeLlmBackend` in an Android RAG setup.
Setup:
- **Libs:** `com.google.mediapipe:tasks-genai:0.10.24`, `com.google.ai.edge.localagents:localagents-rag:0.2.0` (declared as in the Gradle sketch after this list)
- **Model:** `gemma-3n-E4B-it-int4.task` (~4.4GB, SHA1: 12ec504f5e1f4f1039faeff35d0c8f36a0befc09), loaded from the app's internal files directory.
- **Device:** Samsung S24 Ultra
- **Architecture:** A `SharedLlmBackend` singleton (with reference counting) manages a single `MediaPipeLlmBackend` instance. Both my `GemmaSDKService` (for direct chat) and `AiEdgeRagService` (for RAG) use this singleton's `acquire()` and `release()` methods.
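For reference, a sketch of the Gradle declarations for these two libraries, assuming a Kotlin DSL module-level build file:

```kotlin
// Module-level build.gradle.kts (Kotlin DSL assumed) - the two libraries listed above
dependencies {
    implementation("com.google.mediapipe:tasks-genai:0.10.24")
    implementation("com.google.ai.edge.localagents:localagents-rag:0.2.0")
}
```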
Problem:
While `SharedLlmBackend` successfully creates the `MediaPipeLlmBackend` object, and `AiEdgeRagService` acquires this instance, the RAG chain's dummy inference calls during its initialization polling loop (10+ retries, >30 seconds) consistently get the `W/MediaPipeLlmBackend: LLM inference is not initialized yet!` warning. This prevents `AiEdgeRagService` from becoming ready.
Key Code Snippets:
1. `SharedLlmBackend.kt`:

```kotlin
// In SharedLlmBackend.kt
object SharedLlmBackend {
    private var backendInstance: MediaPipeLlmBackend? = null
    private var currentModelPath: String? = null
    private var referenceCount = 0
    private val mutex = Mutex()

    suspend fun acquire(context: Context, path: String): MediaPipeLlmBackend? = mutex.withLock {
        Log.i(TAG, "acquire() called for path: \"$path\". Current backend: \"$currentModelPath\", refCount: $referenceCount")
        if (backendInstance != null && currentModelPath == path) {
            referenceCount++
            Log.i(TAG, "Returning existing backend for \"$path\". New refCount: $referenceCount")
            return@withLock backendInstance
        }
        if (backendInstance != null && currentModelPath != path) { // Path changed
            Log.w(TAG, "Path changed. Closing existing backend for \"$currentModelPath\".")
            try { backendInstance?.close() } catch (e: Exception) { Log.e(TAG, "Error closing previous backend for \"$currentModelPath\": ${e.message}", e) }
        }
        backendInstance = null; currentModelPath = null; referenceCount = 0
        Log.i(TAG, "Attempting to create new MediaPipeLlmBackend for path: \"$path\"")
        return try {
            val llmOptions = LlmInference.LlmInferenceOptions.builder()
                .setModelPath(path).setMaxTokens(1024).build() // Using 1024 to match AiEdgeRagService
            val sessionOptions = LlmInferenceSession.LlmInferenceSessionOptions.builder().build()
            val newBackend = MediaPipeLlmBackend(context.applicationContext, llmOptions, sessionOptions)
            backendInstance = newBackend
            currentModelPath = path
            referenceCount = 1
            Log.i(TAG, "Successfully created MediaPipeLlmBackend for \"$path\". New refCount: $referenceCount")
            newBackend
        } catch (e: Exception) {
            Log.e(TAG, "CRITICAL: Failed to create MediaPipeLlmBackend for \"$path\": ${e.message}", e)
            backendInstance = null; currentModelPath = null; referenceCount = 0
            null
        }
    }

    // ... release() and close() methods omitted for brevity ...
}
```
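For context, the elided `release()` is conceptually the mirror of `acquire()`. A hedged sketch of what it does (not the actual implementation):

```kotlin
// Sketch only: decrement the reference count and close the shared backend
// once no service holds it any more.
suspend fun release() = mutex.withLock {
    if (referenceCount > 0) referenceCount--
    Log.i(TAG, "release() called. New refCount: $referenceCount")
    if (referenceCount == 0 && backendInstance != null) {
        try {
            backendInstance?.close()
        } catch (e: Exception) {
            Log.e(TAG, "Error closing backend for \"$currentModelPath\": ${e.message}", e)
        }
        backendInstance = null
        currentModelPath = null
    }
}
```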
2. `AiEdgeRagService.initialize()`:

```kotlin
// In AiEdgeRagService.initialize()
mediaPipeLlmBackendForRag = SharedLlmBackend.acquire(application.applicationContext, gemmaModelPathForRag!!)
if (mediaPipeLlmBackendForRag == null) {
    Log.e(TAG, "[initialize] CRITICAL: SharedLlmBackend.acquire returned null for path: $gemmaModelPathForRag")
    throw IllegalStateException("SharedLlmBackend.acquire returned null")
}
Log.i(TAG, "[initialize] Successfully acquired MediaPipeLlmBackend from SharedLlmBackend.")

retrievalAndInferenceChain = RetrievalAndInferenceChain(ChainConfig.create(mediaPipeLlmBackendForRag!!, /* ... */))
Log.i(TAG, "[initialize] RetrievalAndInferenceChain READY.")

var ragBackendChainReady = false
// DEFAULT_MAX_RETRIES = 10, DEFAULT_RETRY_DELAY_MS = 3000L
repeat(DEFAULT_MAX_RETRIES) { tryIdx ->
    Log.d(TAG, "[initialize] Attempting dummy RAG inference #${tryIdx + 1}...")
    try {
        val dummyRequest = RetrievalRequest.create("Test RAG readiness.", RetrievalConfig.create(1, 0.1f, RetrievalConfig.TaskType.QUESTION_ANSWERING))
        val response = retrievalAndInferenceChain!!.invoke(dummyRequest, null).await()
        val responseText = response.text?.trim()
        Log.d(TAG, "[initialize] Dummy RAG inference result #${tryIdx + 1}: '${responseText?.take(100)}...'")
        if (responseText?.contains("not initialized", ignoreCase = true) == false && !responseText.isNullOrBlank()) {
            ragBackendChainReady = true; Log.i(TAG, "[initialize] Dummy RAG inference SUCCEEDED on try #${tryIdx + 1}."); return@repeat
        }
        Log.w(TAG, "[initialize] Dummy RAG inference NOT READY (try #${tryIdx + 1}). Response: $responseText")
    } catch (ex: Exception) {
        Log.w(TAG, "[initialize] Dummy RAG inference FAILED on try #${tryIdx + 1}: ${ex.message}", ex)
    }
    if (!ragBackendChainReady && tryIdx < DEFAULT_MAX_RETRIES - 1) delay(DEFAULT_RETRY_DELAY_MS)
}
if (!ragBackendChainReady) {
    Log.e(TAG, "[initialize] CRITICAL: RAG backend NOT READY after polling!")
    resetAllVars()
    return@withContext false
}
```
Key Log Snippets:

```
I/SharedLlmBackend: acquire() called for path: "/data/user/0/com.your.app/files/gemma-3n-E4B-it-int4.task". Current backend path: "null", refCount: 0
I/SharedLlmBackend: Attempting to create new MediaPipeLlmBackend instance for path: "/data/user/0/com.your.app/files/gemma-3n-E4B-it-int4.task"
I/MediaPipeLlmBackend: Constructor.
I/SharedLlmBackend: Successfully created and acquired new MediaPipeLlmBackend for "/data/user/0/com.your.app/files/gemma-3n-E4B-it-int4.task". New refCount: 1
I/AiEdgeRagService: [initialize] Successfully acquired/created MediaPipeLlmBackend instance from SharedLlmBackend.
I/AiEdgeRagService: [initialize] RetrievalAndInferenceChain READY.
D/AiEdgeRagService: [initialize] Attempting dummy RAG inference #1...
W/MediaPipeLlmBackend: LLM inference is not initialized yet!
D/AiEdgeRagService: [initialize] Dummy RAG inference result #1: 'LLM inference is not initialized yet!...'
W/AiEdgeRagService: [initialize] Dummy RAG inference NOT READY (try #1). Response: LLM inference is not initialized yet!
... (repeats for multiple retries) ...
E/AiEdgeRagService: [initialize] CRITICAL: RAG inference chain (MediaPipeLlmBackend) NOT READY after polling!
```
Questions:
- Is there a known characteristic of the `gemma-3n-E4B-it-int4.task` model (or large on-device LLMs generally via the `LlmInference` API) that would cause such a long delay (>30-60 seconds) for the internal inference engine to become fully operational after the `MediaPipeLlmBackend` object is constructed on Android (e.g., on a capable device like an S24 Ultra)?
- Is there a more direct way to check/ensure the readiness of the `LlmInference` engine managed by `MediaPipeLlmBackend`, other than attempting an inference call?
- Are there specific `LlmInference.LlmInferenceOptions` (e.g., related to delegates or model loading parameters) or `LlmInferenceSession.LlmInferenceSessionOptions` that are recommended or critical for robust and timely initialization of this model size/type on Android? (Note: previous attempts to use `BaseOptions.setDelegate()` with `LlmInferenceOptions` led to an "unresolved reference" for `tasks.core.Delegate`.)
We've confirmed the `MediaPipeLlmBackend` object is created and shared via the singleton. The primary issue appears to be the internal readiness of the LLM engine itself within that backend.
Any insights or diagnostic suggestions would be greatly appreciated.
Thank you!