- Overview
- Installation from APK
- Project Architecture
- Technology Stack
- On-Device AI Models
- Application Features
- Workflow and Data Flow
- Project Structure
- Setup and Installation
- Building and Running
- System Requirements
Kortex is a sophisticated Android photo editing app that delivers professional-level image editing by combining cloud-based AI services with on-device machine learning models. The application leverages ONNX Runtime for efficient on-device inference, GPU acceleration for real-time adjustments, and offline speech recognition for voice-controlled editing.
The application is built with contemporary Android development practices: Jetpack Compose for the UI layer, Kotlin Coroutines for asynchronous operations, and the MVVM (Model-View-ViewModel) architectural pattern for clear separation of concerns.
To try Kortex quickly, install the pre-built release APK included in this repository.
Step 1: Enable Unknown Sources
Before installing the APK, you need to allow installation from unknown sources:
- Open Settings on your Android device
- Navigate to Security or Privacy (location varies by Android version)
- Find and enable Install unknown apps or Unknown sources
- Select your file manager or browser and allow installation from that source
Note: On Android 8.0 (Oreo) and above, you grant permission per app. On older versions, there's a global setting.
Step 2: Transfer the APK to Your Device
Choose one of these methods:
Method A: Direct Download (if shared online)
- Download the APK directly on your device from the shared link
- The APK will be saved to your Downloads folder
Method B: USB Transfer
- Connect your Android device to your computer via USB
- Enable File Transfer mode when prompted on your device
- Navigate to the project folder: Kortex-app/apk/
- Copy app-release.apk to your device's Downloads folder or internal storage
Method C: Cloud Transfer
- Upload app-release.apk from the Kortex-app/apk/ folder to Google Drive, Dropbox, etc.
- Download it on your Android device from the cloud service
Step 3: Install the APK
- Open your device's File Manager app
- Navigate to the folder where you saved the APK (usually Downloads)
- Tap on app-release.apk
- Review the permissions requested by the app:
- Storage (for saving/loading images)
- Camera (for taking photos)
- Microphone (for voice commands)
- Internet (for cloud AI features)
- Tap Install
- Wait for installation to complete (may take 30-60 seconds due to ML models)
- Tap Open to launch Kortex, or find it in your app drawer
Step 4: Grant Runtime Permissions
When you first use certain features, Android will ask for permissions:
- Storage/Photos: Required to edit images from your gallery
- Camera: Required to take new photos (optional)
- Microphone: Required for voice commands (optional)
- Internet: Required for cloud AI features (optional - app works offline)
The application follows the MVVM (Model-View-ViewModel) architecture pattern in conjunction with the repository pattern for data management:
┌──────────────────────────────────────────────────────────┐
│                        View Layer                        │
│              (Jetpack Compose UI Components)             │
│                                                          │
│   ┌────────────┐  ┌────────────┐  ┌────────────┐         │
│   │ PhotoEditor│  │   Adjust   │  │ Background │         │
│   │   Screen   │  │   Screen   │  │   Removal  │         │
│   └────────────┘  └────────────┘  └────────────┘         │
└─────────────┬────────────────────────────────────────────┘
              │
              │ User Actions / State Observation
              ▼
┌──────────────────────────────────────────────────────────┐
│                     ViewModel Layer                      │
│                                                          │
│   ┌──────────────────────────────────────────────────┐   │
│   │              PhotoEditorViewModel                │   │
│   │  • Manages UI state                              │   │
│   │  • Handles user interactions                     │   │
│   │  • Coordinates with repositories and executors   │   │
│   └──────────────────────────────────────────────────┘   │
└─────────────┬────────────────────────────────────────────┘
              │
              │ Data Requests / Commands
              ▼
┌──────────────────────────────────────────────────────────┐
│                     Data/Model Layer                     │
│                                                          │
│   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │
│   │ Repositories │  │ ML Executors │  │  Utilities   │   │
│   │              │  │              │  │              │   │
│   │ • Retouch    │  │ • LaMa       │  │ • Image      │   │
│   │ • CloudEdit  │  │ • SAM        │  │ • Watermark  │   │
│   │              │  │ • AutoEnhance│  │ • Font       │   │
│   └──────────────┘  └──────────────┘  └──────────────┘   │
└──────────────────────────────────────────────────────────┘
User Input
      │
      ▼
┌───────────────────┐
│    Compose UI     │
│    Components     │
└─────────┬─────────┘
          │ Events
          ▼
┌───────────────────────────┐
│   PhotoEditorViewModel    │
│   • State Management      │
│   • Business Logic        │
└─────┬─────────────────────┘
      │
      ├────────────────┬────────────────┬────────────────┐
      ▼                ▼                ▼                ▼
┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│  On-Device  │  │    Cloud    │  │  GPU Image  │  │    Local    │
│  ML Models  │  │    APIs     │  │  Processing │  │   Storage   │
│             │  │             │  │             │  │             │
│ • LaMa      │  │ • Retouch   │  │ • Adjust    │  │ • Files     │
│ • EdgeSAM   │  │ • SmartFill │  │ • Filters   │  │ • Cache     │
│ • AutoEnhc. │  │ • Sticker   │  │             │  │             │
│ • Vosk      │  │   Harmonize │  │             │  │             │
└─────────────┘  └─────────────┘  └─────────────┘  └─────────────┘
Language and Framework
- Kotlin 2.0.21
- Android SDK (Min SDK 24, Target SDK 34, Compile SDK 36)
- Jetpack Compose (Material3)
Build System
- Gradle 8.13.1 with Kotlin DSL
- Android Gradle Plugin 8.13.1
Architecture Components
- Lifecycle ViewModel Compose 2.7.0
- Kotlin Coroutines with Dispatchers
- StateFlow for reactive state management
Networking
- Retrofit 2.11.0 (REST API communication)
- OkHttp 4.12.0 (HTTP client with logging interceptor)
- Gson Converter 2.11.0 (JSON serialization)
Machine Learning
- ONNX Runtime Android 1.17.0 (On-device inference)
- Vosk Android 0.3.32 (Offline speech recognition)
- JNA 5.13.0 (Java Native Access)
Image Processing
- Coil 2.5.0 (Async image loading)
- GPUImage 2.1.0 (GPU-accelerated filters)
- ExifInterface 1.3.7 (Image metadata handling)
UI and Permissions
- Material Icons Extended
- Accompanist Permissions 0.32.0
The application uses multiple ONNX format neural network models that run entirely on the device without requiring internet connectivity. These models are stored in the assets folder and loaded at runtime.
File: lama_fp32.onnx
Purpose: Advanced image inpainting for object removal and content-aware fill
Architecture:
Input: image (1x3x512x512) + mask (1x1x512x512)
          │
          ▼
┌──────────────────────┐
│  Fast Fourier Conv   │
│   Encoder-Decoder    │
└──────────┬───────────┘
           ▼
Output: inpainted_image (1x3x512x512)
Technical Specifications:
- Input Image Shape: [1, 3, 512, 512]
- Input Mask Shape: [1, 1, 512, 512]
- Output Shape: [1, 3, 512, 512]
- Precision: FP32
- Acceleration: NNAPI hardware acceleration when available (Android 8.1+)
Processing Pipeline:
Original Image → Resize to 512x512 → Normalize to [0,1]
                          │
Mask Image → Resize to 512x512 → Dilate (10px) → Binary threshold
                          │
          ┌───────────────┴────────────────┐
          ▼                                ▼
Image Tensor (CHW format)            Mask Tensor
          │                                │
          └───────────────┬────────────────┘
                          ▼
                 LaMa ONNX Inference
                          │
                          ▼
                  Inpainted Result
                          │
                          ▼
         Denormalize → Resize to original size
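The normalize-and-pack step above can be sketched in pure Kotlin. This is an illustrative helper (not the app's actual ImageUtils code) showing how ARGB pixels map into the [1, 3, 512, 512] CHW float layout the model expects:

```kotlin
// Convert a row-major ARGB pixel array into a normalized CHW float tensor.
// Illustrative only: the app's real pipeline works on Android Bitmaps.
fun toChwTensor(pixels: IntArray, width: Int, height: Int): FloatArray {
    val plane = width * height
    val tensor = FloatArray(3 * plane)            // planes: R, G, B
    for (i in 0 until plane) {
        val p = pixels[i]
        tensor[i]             = ((p shr 16) and 0xFF) / 255f  // R plane
        tensor[plane + i]     = ((p shr 8) and 0xFF) / 255f   // G plane
        tensor[2 * plane + i] = (p and 0xFF) / 255f           // B plane
    }
    return tensor
}
```

The same layout is used for the 1x1x512x512 mask, with a single plane of 0/1 values after thresholding.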
Use Cases:
- Removing objects from pictures
- Cleaning up unwanted elements
- Manual mask-based inpainting
- Background object removal
Files: sam_encoder.onnx and sam_decoder.onnx
Purpose: Interactive image segmentation with point-based selection
Two-Stage Architecture:
Stage 1: Encoder (Heavy, Run Once)
──────────────────────────────────
Input Image (1x3x1024x1024)
          │
          ▼
┌───────────────────┐
│  Vision Transform │
│      Encoder      │
└─────────┬─────────┘
          ▼
Image Embeddings (1x256x64x64)
          │
          └──────► Cache for reuse
- The user then clicks a point or draws a bounding box
Stage 2: Decoder (Lightweight, Interactive)
───────────────────────────────────────────
Cached Embeddings + Point Coords + Labels
          │
          ▼
┌───────────────────────┐
│     Mask Decoder      │
│   + Prompt Encoder    │
└───────────┬───────────┘
            ▼
Segmentation Masks
(1x1x256x256) × 4 variants
            │
            ▼
IoU Scores (1x4)
Technical Specifications:
Encoder:
- Input Shape: [1, 3, 1024, 1024]
- Output Shape: [1, 256, 64, 64]
- Execution: Once per image
Decoder:
- Inputs:
- Image Embeddings: [1, 256, 64, 64]
- Point Coordinates: [1, N, 2]
- Point Labels: [1, N] (1=foreground, 0=background, -1=padding)
- Mask Input: [1, 1, 256, 256] (optional)
- Has Mask Input: [1] (boolean)
- Original Image Size: [2] (height, width)
- Outputs:
- Masks: [1, 4, 256, 256]
- IoU Predictions: [1, 4]
Processing Flow:
User loads image
β
βΌ
Run Encoder (slow, ~1-2 seconds)
β
βΌ
Cache embeddings in memory
β
βΌ
User taps on object βββββββ
β β
βΌ β
Transform tap coords β
to model space (1024x1024)β
β β
βΌ β
Run Decoder (fast, <100ms)β
β β
βΌ β
Select best mask by IoU β
β β
βΌ β
Resize to original size β
β β
βΌ β
Display segmentation β
β β
βββββββββββββββββββββββ
(User can tap again)
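The coordinate transform in the loop above maps a tap in original-image space into the encoder's 1024x1024 input space. A minimal sketch, assuming the common SAM convention of scaling the longest side to 1024 and padding the remainder (the helper name is illustrative, not the app's actual API):

```kotlin
// Map a tap at (x, y) in an image of size (w, h) into edgeSAM's
// 1024x1024 model space. Assumes longest-side scaling with padding,
// as used by the reference SAM preprocessing.
fun toModelSpace(x: Float, y: Float, w: Int, h: Int): Pair<Float, Float> {
    val scale = 1024f / maxOf(w, h)   // uniform scale, no distortion
    return Pair(x * scale, y * scale)
}
```

The decoder also receives the original (height, width) so its 256x256 masks can be resized back to image space.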
Use Cases:
- Background removal with tap selection
- Object isolation
- Quick mask generation
- Interactive segmentation
Files: analyzer_8param_v2.onnx and hdrnet_fixer_safe.onnx
Purpose: Automatic image quality analysis and parameter extraction
Dual-Model System:
Model 1: Analyzer (Parameter Extraction)
────────────────────────────────────────
Input Image (1x3x256x256)
          │
          ▼
┌──────────────────────┐
│  MobileViT Backbone  │
│   + Analysis Head    │
└──────────┬───────────┘
           │
           ├──► Edit Parameters (1x8)
           │    [exposure, contrast, saturation,
           │     brightness, highlights, shadows,
           │     temperature, sharpness]
           │
           └──► Rationale Logits (1x4)
                [underexposed, overexposed,
                 unsaturated, good]
Model 2: Fixer (Parameter Application)
──────────────────────────────────────
Original Image + Parameters
          │
          ▼
┌──────────────────────┐
│ HDRNet Architecture  │
│   Bilateral Grid     │
└──────────┬───────────┘
           ▼
    Enhanced Image
Technical Specifications:
Analyzer:
- Input Shape: [1, 3, 256, 256]
- Outputs:
- Parameters: [1, 8] float values
- Rationale: [1, 4] classification logits
- Parameter Ranges: Typically [-1, 1] or [0, 2]
Parameter Mapping:
Index 0: Exposure    → Exposure adjustment
Index 1: Contrast    → Local contrast
Index 2: Saturation  → Color intensity
Index 3: Brightness  → Overall brightness
Index 4: Highlights  → Bright region control
Index 5: Shadows     → Dark region control
Index 6: Temperature → White balance (cool/warm)
Index 7: Sharpness   → Edge enhancement
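The eight-element output vector can be decoded into named values following the index table above. A sketch with an illustrative data class (not the app's actual types):

```kotlin
// Decode the analyzer's [1, 8] output into named adjustment parameters.
// Field names mirror the index mapping above; this is illustrative,
// not the app's actual data model.
data class EnhanceParams(
    val exposure: Float, val contrast: Float, val saturation: Float,
    val brightness: Float, val highlights: Float, val shadows: Float,
    val temperature: Float, val sharpness: Float
) {
    companion object {
        fun fromModelOutput(v: FloatArray): EnhanceParams {
            require(v.size == 8) { "Analyzer emits exactly 8 parameters" }
            return EnhanceParams(v[0], v[1], v[2], v[3], v[4], v[5], v[6], v[7])
        }
    }
}
```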
Analysis Workflow:
Input Image
      │
      ▼
Resize to 256x256
      │
      ▼
Normalize to [0,1]
      │
      ▼
Run Analyzer Model
      │
      ├──► Extract 8 parameters
      │            │
      │            ▼
      │     Map to adjustment sliders
      │            │
      │            ▼
      │     Apply via GPU filters
      │
      └──► Parse rationale
                   │
                   ▼
           Display diagnosis
           [Softmax classification]
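The "parse rationale" branch is a softmax over the four logits followed by an argmax. A self-contained sketch (label order follows the diagram above; this is not the app's actual code):

```kotlin
import kotlin.math.exp

// Softmax over the analyzer's four rationale logits, then argmax
// to pick the diagnosis label shown to the user.
val rationaleLabels = listOf("underexposed", "overexposed", "unsaturated", "good")

fun diagnose(logits: FloatArray): String {
    val m = logits.maxOrNull()!!                       // stabilize exp()
    val exps = logits.map { exp((it - m).toDouble()) }
    val sum = exps.sum()
    val probs = exps.map { it / sum }                  // softmax probabilities
    var best = 0
    for (i in probs.indices) if (probs[i] > probs[best]) best = i
    return rationaleLabels[best]
}
```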
Use Cases:
- Automatic image enhancement suggestions
- Quality analysis
- Parameter extraction for manual adjustment
Directory: vosk-model-small-en-us-0.15/
Purpose: Offline voice command recognition for hands-free editing
Model Type: Speech recognition model (Kaldi-based)
Architecture Overview:
Audio Input (16kHz PCM)
          │
          ▼
┌─────────────────────┐
│ Audio Preprocessing │
│ • Framing           │
│ • Feature Extract   │
│ • MFCC/Fbank        │
└─────────┬───────────┘
          ▼
┌─────────────────────┐
│   Acoustic Model    │
│     (DNN/TDNN)      │
└─────────┬───────────┘
          ▼
┌─────────────────────┐
│   Language Model    │
│   (N-gram/RNNLM)    │
└─────────┬───────────┘
          ▼
   Transcribed Text
Technical Specifications:
- Sample Rate: 16,000 Hz
- Model Size: Small (41MB approx)
- Language: English (US)
- Latency: Real-time streaming
- Output: JSON with partial and final results
Recognition Flow:
User presses mic button
      │
      ▼
Request RECORD_AUDIO permission
      │
      ▼
Initialize Vosk Model (if first time)
      │
      ▼
Start SpeechService
      │
      ▼
Audio Stream ───┐
                │
                ▼
┌─────────────────────────┐
│ Continuous Recognition  │
│ • Partial results       │
│ • Final results         │
│ • Auto-stop (5s)        │
└───────────┬─────────────┘
            │
            ▼
Update chat interface
            │
            ▼
Send to API or execute command
Use Cases:
- Voice-activated commands such as "make it brighter"
- Retouching based on spoken instructions
- Hands-free operation
- Accessibility features
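Vosk delivers results as small JSON strings: {"partial": "..."} while the user is speaking and {"text": "..."} once an utterance ends. A minimal stdlib-only extractor (the app may use a real JSON parser instead) distinguishes the two:

```kotlin
// Extract the transcript from a Vosk result string and flag whether
// it is a final result ("text") or an in-progress one ("partial").
// Illustrative sketch; a production parser should handle escapes.
data class SpeechResult(val text: String, val isFinal: Boolean)

fun parseVoskResult(json: String): SpeechResult? {
    val final = Regex("\"text\"\\s*:\\s*\"([^\"]*)\"").find(json)
    if (final != null) return SpeechResult(final.groupValues[1], isFinal = true)
    val partial = Regex("\"partial\"\\s*:\\s*\"([^\"]*)\"").find(json)
    if (partial != null) return SpeechResult(partial.groupValues[1], isFinal = false)
    return null
}
```

Partial results drive the live transcript in the chat interface; only final results are sent on as commands.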
Initialization Strategy:
Application Start
      │
      ▼
MainActivity.onCreate()
      │
      ▼
User selects feature
      │
      ▼
Lazy initialization of required model
      │
      ├──► Copy from assets to cache (if needed)
      │
      ├──► Check device capabilities
      │      • NNAPI availability
      │      • GPU acceleration
      │      • Memory constraints
      │
      ├──► Configure OrtSession.SessionOptions
      │      • OptLevel.ALL_OPT
      │      • Add NNAPI provider (if available)
      │
      ├──► Create OrtSession
      │
      └──► Model ready for inference
Memory Management:
┌──────────────────────────────────┐
│         Model Lifecycle          │
├──────────────────────────────────┤
│                                  │
│  Feature Activated               │
│        │                         │
│        ▼                         │
│  Load model to memory            │
│        │                         │
│        ▼                         │
│  Keep in memory during use       │
│        │                         │
│        ▼                         │
│  User exits feature              │
│        │                         │
│        ▼                         │
│  Release tensors                 │
│        │                         │
│        ▼                         │
│  Session persists (reusable)     │
│        │                         │
│        ▼                         │
│  App background/destroy          │
│        │                         │
│        ▼                         │
│  Full cleanup                    │
│                                  │
└──────────────────────────────────┘
- Fine-tune exposure, brightness, and contrast
- Control highlights and shadows
- Adjust saturation, vibrance, and hue
- Set temperature/white balance
- Enhance sharpness and texture
- Smooth, real-time slider preview
- One-tap subject selection using edgeSAM
- Automatic background cutout
- Manual brush-based refinement
- Instant background replacement
- Smart edge clean-up
- Tap to remove unwanted objects
- LaMa-powered content-aware fill
- Manual mask painting
- Multi-object removal
- Auto-dilation for smooth blending
- Edit using natural instructions ("make it warmer")
- Reference image style transfer
- Offline voice commands
- Parameter extraction & visualization
- Smart filtering of relevant edits
- Fill or extend images using custom prompts
- Adjustable vibe/style strength
- AI-powered background replacement
- Automatic lighting/color harmonization
- AI-generated content watermark
- Select objects with edgeSAM
- Drag to reposition anywhere
- Automatic inpainting of original area
- Harmonized shadows + lighting
- Optional manual mask refinement
- Popular aspect ratios (1:1, 4:3, 16:9, etc.)
- Freeform cropping
- Corner-based perspective correction
- Rotation with angle display
- Grid overlay for precision
- Add customizable text
- Multiple font options
- Color, opacity, and size control
- Rotate, scale, and reposition
- Shadow effects
- Add visible watermarks
- Hidden LSB steganographic watermarking
- Optional "Edited by AI" stamps
- Full font & size customization
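The hidden LSB watermark works by storing payload bits in the least-significant bit of pixel channel values, which is visually imperceptible. A minimal sketch on ARGB pixel ints (the app's LSBWatermarkUtil.kt may differ in channel choice and payload framing):

```kotlin
// Hide one message bit in the least-significant bit of a pixel's
// blue channel (the lowest byte of an ARGB Int). Illustrative only.
fun embedBit(pixel: Int, bit: Int): Int =
    (pixel and 0x01.inv()) or (bit and 0x01)

// Recover the hidden bit from a pixel.
fun extractBit(pixel: Int): Int = pixel and 0x01
```

Changing only the lowest blue bit alters the channel value by at most 1 out of 255, so the watermark survives casual viewing but not lossy re-compression.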
- Brush-based local adjustments
- Paint to select regions
- Adjustable brush size
- Preview before applying
1. Smart Fill API
- Endpoint for generative inpainting
- Prompt-based content generation
- Configurable strength parameters
2. Harmonization API
- Blends moved objects naturally
- Matches lighting and color
3. AI Retouch API
- Reference-based style transfer
- Instruction-based parameter extraction
App Launch
     │
     ▼
MainActivity.onCreate()
     │
     ├──► Initialize Compose UI
     │
     ├──► Create PhotoEditorViewModel
     │
     ▼
PhotoEditorScreen (Idle State)
     │
     │ User Action
     ▼
┌────────────────────────┐
│    Image Selection     │
│    • Gallery Picker    │
│    • Camera Capture    │
└───────────┬────────────┘
            │
            ▼
Load Image → Update ViewModel State
            │
            ▼
┌────────────────────────────────────┐
│          Main Editor View          │
│                                    │
│  ┌──────────────┐  ┌────────────┐  │
│  │ Image Canvas │  │ Side Panel │  │
│  │              │  │            │  │
│  │ • Display    │  │ • Adjust   │  │
│  │ • Interact   │  │ • AI Tools │  │
│  │ • Transform  │  │ • Effects  │  │
│  └──────────────┘  └────────────┘  │
│                                    │
│  ┌──────────────────────────────┐  │
│  │         Top App Bar          │  │
│  │  • Undo/Redo                 │  │
│  │  • Save/Export               │  │
│  └──────────────────────────────┘  │
└────────────────────────────────────┘
User selects feature from Side Panel
         │
         ├────────────────────┬─────────────────────┐
         ▼                    ▼                     ▼
   Adjust Mode           AI Tool Mode         Transform Mode
         │                    │                     │
         ▼                    ▼                     ▼
  Load AdjustScreen     Initialize Model      Enter Crop/Rotate
         │                    │                     │
         ▼                    ▼                     ▼
  Display sliders       Wait for user input   Interactive overlay
         │                    │                     │
         ▼                    ▼                     ▼
  Real-time preview     Process inference     Preview transform
         │                    │                     │
         ▼                    ▼                     ▼
  Apply on confirm      Update image          Apply on confirm
Original Image URI
        │
        ▼
Load to Bitmap
        │
        ▼
┌───────────────────────────────────┐
│       Processing Selection        │
├───────────────────────────────────┤
│                                   │
│  ┌─────────────────────────────┐  │
│  │    On-Device Processing     │  │
│  │                             │  │
│  │  • Adjustments (GPU)        │  │
│  │  • LaMa Inpainting          │  │
│  │  • edgeSAM Segmentation     │  │
│  │  • Auto Enhance             │  │
│  └─────────┬───────────────────┘  │
│            │                      │
│            ▼                      │
│      Process locally              │
│            │                      │
│            ▼                      │
│      Result Bitmap                │
│                                   │
│  ┌─────────────────────────────┐  │
│  │      Cloud Processing       │  │
│  │                             │  │
│  │  • Smart Fill               │  │
│  │  • Harmonization            │  │
│  │  • AI Retouch               │  │
│  └─────────┬───────────────────┘  │
│            │                      │
│            ▼                      │
│      Upload via Retrofit          │
│            │                      │
│            ▼                      │
│      Wait for response            │
│            │                      │
│            ▼                      │
│      Download result              │
└───────────┬───────────────────┬───┘
            │                   │
            ▼                   ▼
     Save to cache          Update UI
            │                   │
            ▼                   ▼
     Generate URI         Display result
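The routing decision above, on-device versus cloud, can be sketched by tagging each operation with where it runs. The type and operation names here are illustrative, not the app's actual classes:

```kotlin
// Each edit operation declares whether it runs on-device; cloud
// operations are exactly those that require a network round trip.
sealed class EditOp(val runsOnDevice: Boolean) {
    object Adjust        : EditOp(true)    // GPU filters
    object LamaInpaint   : EditOp(true)    // ONNX, local
    object Segmentation  : EditOp(true)    // edgeSAM, local
    object AutoEnhance   : EditOp(true)    // analyzer + fixer, local
    object SmartFill     : EditOp(false)   // cloud API
    object Harmonization : EditOp(false)   // cloud API
    object AiRetouch     : EditOp(false)   // cloud API
}

fun requiresNetwork(op: EditOp): Boolean = !op.runsOnDevice
```

Modeling this as a sealed hierarchy lets the ViewModel branch exhaustively on the operation type when choosing between the local executor and the Retrofit upload path.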
User Interaction
        │
        ▼
Event Handler in Composable
        │
        ▼
Call ViewModel method
        │
        ▼
┌────────────────────────────────┐
│      PhotoEditorViewModel      │
│                                │
│  1. Validate action            │
│  2. Update state variables     │
│  3. Trigger coroutine          │
│  4. Launch processing          │
│  5. Handle result              │
│  6. Update state again         │
└────────┬───────────────────────┘
         │
         ▼
State changes trigger recomposition
         │
         ▼
UI updates automatically
         │
         ▼
User sees result
Initial State: currentImageUri = null, history = []
        │
        ▼
User loads image → history = [uri1]
        │
        ▼
User applies adjustment → history = [uri1, uri2]
        │
        ▼
User applies crop → history = [uri1, uri2, uri3]
        │
        ▼
User clicks Undo
        │
        ▼
Pop from history → Display uri2
        │
        ▼
User clicks Undo again
        │
        ▼
Pop from history → Display uri1
        │
        ▼
User applies new edit
        │
        ▼
Clear forward history → history = [uri1, uri4]
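The history behavior traced above, where undo moves a cursor back and a new edit discards the "forward" entries, can be sketched as a small generic class. This is illustrative, not the ViewModel's actual implementation:

```kotlin
// Undo history with a cursor. A new edit after one or more undos
// clears the forward (redo) entries, matching the trace above.
class EditHistory<T> {
    private val entries = mutableListOf<T>()
    private var cursor = -1                        // index of current entry

    val current: T? get() = entries.getOrNull(cursor)

    fun push(entry: T) {
        // Discard anything ahead of the cursor before appending.
        while (entries.size > cursor + 1) entries.removeAt(entries.size - 1)
        entries.add(entry)
        cursor = entries.lastIndex
    }

    fun undo(): T? {
        if (cursor > 0) cursor--
        return current
    }
}
```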
Kortex/
│
├── app/
│   ├── build.gradle.kts          Build configuration
│   ├── proguard-rules.pro        Code obfuscation rules
│   │
│   └── src/
│       ├── main/
│       │   ├── AndroidManifest.xml
│       │   │
│       │   ├── java/test1/example/finalapp/
│       │   │   │
│       │   │   ├── MainActivity.kt            Main entry point
│       │   │   ├── PhotoEditorViewModel.kt    Central state manager
│       │   │   ├── LamaExecutor.kt            LaMa model executor
│       │   │   ├── ZimExecutor.kt             edgeSAM model executor
│       │   │   │
│       │   │   ├── View/
│       │   │   │   ├── screens/
│       │   │   │   │   └── PhotoEditorScreen.kt   Main editor UI
│       │   │   │   │
│       │   │   │   └── components/
│       │   │   │       ├── BackgroundRemovalScreen.kt
│       │   │   │       ├── MoveObjectPlacementScreen.kt
│       │   │   │       ├── ManualMaskPainter.kt
│       │   │   │       ├── CropComponents.kt
│       │   │   │       ├── AspectRatioCropEngine.kt
│       │   │   │       ├── CornerCropEngine.kt
│       │   │   │       ├── RotationEngine.kt
│       │   │   │       ├── WatermarkComponents.kt
│       │   │   │       ├── LSBWatermarkScreen.kt
│       │   │   │       ├── TextSticker.kt
│       │   │   │       ├── TextEditingScreen.kt
│       │   │   │       ├── SidePanel.kt
│       │   │   │       ├── TopAppBar.kt
│       │   │   │       └── GenerativeFillDialog.kt
│       │   │   │
│       │   │   ├── ui/
│       │   │   │   ├── theme/
│       │   │   │   │   ├── Color.kt
│       │   │   │   │   ├── Theme.kt
│       │   │   │   │   ├── Type.kt
│       │   │   │   │   └── AppTheme.kt
│       │   │   │   │
│       │   │   │   └── adjust/
│       │   │   │       ├── AdjustSheetComposable.kt
│       │   │   │       └── AdjustIntegrationExample.kt
│       │   │   │
│       │   │   ├── component/
│       │   │   │   └── adjust/
│       │   │   │       ├── AdjustEngine.kt      GPU filter engine
│       │   │   │       ├── AdjustViewModel.kt
│       │   │   │       ├── AdjustParams.kt
│       │   │   │       ├── PreviewRenderer.kt
│       │   │   │       ├── CurveEditorView.kt
│       │   │   │       └── ToneCurveData.kt
│       │   │   │
│       │   │   ├── ml/
│       │   │   │   └── AutoEnhanceExecutor.kt   Auto enhance model
│       │   │   │
│       │   │   ├── audio/
│       │   │   │   └── OfflineSpeechManager.kt  Vosk integration
│       │   │   │
│       │   │   ├── data/
│       │   │   │   ├── api/
│       │   │   │   │   ├── RetrofitClient.kt
│       │   │   │   │   ├── RetouchApiService.kt
│       │   │   │   │   ├── CloudEditNetwork.kt
│       │   │   │   │   └── CloudEditApiService.kt
│       │   │   │   │
│       │   │   │   ├── repository/
│       │   │   │   │   ├── RetouchRepository.kt
│       │   │   │   │   └── CloudEditRepository.kt
│       │   │   │   │
│       │   │   │   └── model/
│       │   │   │       ├── RetouchModels.kt
│       │   │   │       └── AdjustState.kt
│       │   │   │
│       │   │   ├── utils/
│       │   │   │   ├── ImageUtils.kt            Image operations
│       │   │   │   ├── AIWatermarkHelper.kt     Watermarking
│       │   │   │   ├── LSBWatermarkUtil.kt      Steganography
│       │   │   │   ├── RetouchUtils.kt
│       │   │   │   └── FontManager.kt
│       │   │   │
│       │   │   └── model/
│       │   │       └── AiEditState.kt
│       │   │
│       │   ├── assets/
│       │   │   ├── lama_fp32.onnx               LaMa model
│       │   │   ├── sam_encoder.onnx             edgeSAM encoder
│       │   │   ├── sam_decoder.onnx             edgeSAM decoder
│       │   │   ├── analyzer_8param_v2.onnx      Auto enhance analyzer
│       │   │   ├── hdrnet_fixer_safe.onnx       Auto enhance fixer
│       │   │   ├── vosk-model-small-en-us-0.15/ Speech model
│       │   │   └── fonts/                       Custom fonts
│       │   │
│       │   └── res/
│       │       ├── values/
│       │       ├── drawable/
│       │       ├── mipmap/
│       │       └── xml/
│       │
│       ├── androidTest/   Instrumented tests
│       └── test/          Unit tests
│
├── gradle/
│   ├── libs.versions.toml   Dependency catalog
│   └── wrapper/
│       ├── gradle-wrapper.jar
│       └── gradle-wrapper.properties
│
├── build.gradle.kts      Root build script
├── settings.gradle.kts   Project settings
├── gradle.properties     Gradle configuration
├── gradlew               Gradle wrapper (Unix)
├── gradlew.bat           Gradle wrapper (Windows)
└── local.properties      Local SDK path
Before setting up the project, ensure you have the following installed:
1. Development Environment
- Android Studio Iguana (2023.2.1) or newer
- JDK 11 or higher
- Minimum 8 GB RAM (16 GB recommended)
- At least 10 GB free disk space
2. Android SDK
- Android SDK Platform 34
- Android SDK Build-Tools 34.0.0 or higher
- Android SDK Platform-Tools
- Android Emulator (if testing on emulator)
3. Git
- Git version control system
Extract the contents of the zip file and locate the Kortex directory.
1. Create local.properties file
Create a file named local.properties in the root directory with your Android SDK path:
sdk.dir=C:\\Users\\YourUsername\\AppData\\Local\\Android\\Sdk
On macOS/Linux:
sdk.dir=/Users/YourUsername/Library/Android/sdk
2. Verify Model Files
Ensure all ONNX model files are present in app/src/main/assets/:
- lama_fp32.onnx
- sam_encoder.onnx
- sam_decoder.onnx
- analyzer_8param_v2.onnx
- hdrnet_fixer_safe.onnx
- vosk-model-small-en-us-0.15/ (directory with model files)
3. Configure API Endpoints (Optional)
If using cloud features, update the API base URLs in:
- app/src/main/java/test1/example/finalapp/data/api/RetrofitClient.kt
- app/src/main/java/test1/example/finalapp/data/api/CloudEditNetwork.kt
Open the project in Android Studio and let Gradle sync automatically. If it doesn't start:
- Click "File" > "Sync Project with Gradle Files"
- Wait for dependencies to download
- Resolve any errors that appear
All dependencies are managed through Gradle and will be downloaded automatically during sync. Key dependencies include:
- Jetpack Compose libraries
- ONNX Runtime Android
- Retrofit and OkHttp
- GPUImage
- Vosk Android
- Coil image loader
The project supports two build variants:
Debug Build
- Includes debugging symbols
- Logging enabled
- No code obfuscation
- Faster build times
Release Build
- Code optimization enabled
- ProGuard rules applied
- Smaller APK size
- Production-ready
1. Create/Start Emulator
In Android Studio:
- Tools > Device Manager
- Create a new virtual device or start existing one
- Recommended: Pixel 5 with API 34 (Android 14)
- Set the Graphics option to "Hardware" (for GPU acceleration)
2. Run the App
Click the "Run" button (green play icon)
or
Select Run > Run 'app'
or
Press Shift+F10 (Windows/Linux) or Control+R (macOS)
1. Enable Developer Options
On your Android device:
- Go to Settings > About Phone
- Tap "Build Number" 7 times
- Go back to Settings > System > Developer Options
- Enable "USB Debugging"
2. Connect Device
- Connect device via USB
- Accept USB debugging prompt on device
- Device should appear in Android Studio device dropdown
3. Run the App
Select your device from the dropdown and click Run
Debug APK
./gradlew assembleDebug
Output location: app/build/outputs/apk/debug/app-debug.apk
Release APK
./gradlew assembleRelease
Output location: app/build/outputs/apk/release/app-release.apk
Note: Release builds require signing configuration
Windows
gradlew.bat clean
gradlew.bat assembleDebug
Out of Memory Error
Increase heap size in gradle.properties:
org.gradle.jvmargs=-Xmx4096m -Dfile.encoding=UTF-8
Duplicate Class Errors
The project already handles libc++_shared.so conflicts in build.gradle.kts with a pickFirsts configuration.
Model Loading Failures
Ensure all ONNX model files are present in the assets folder and are excluded from compression by adding the following to build.gradle.kts (androidResources replaces the deprecated aaptOptions block in recent AGP versions):
android {
    androidResources {
        noCompress += "onnx"
    }
}
NNAPI Errors
Some devices may not support NNAPI; the code gracefully falls back to CPU execution.
Device (Minimum)
- Android 7.0 (API 24) or higher
- 3 GB RAM
- 1 GB free storage
- ARMv7 or ARM64 processor
Permissions
- READ_EXTERNAL_STORAGE
- WRITE_EXTERNAL_STORAGE (Android 9 and below)
- READ_MEDIA_IMAGES (Android 13+)
- RECORD_AUDIO (for voice commands)
- CAMERA (for camera capture)
- INTERNET (for cloud features)
Device (Recommended)
- Android 12.0 (API 31) or higher
- 6 GB RAM or more
- 2 GB free storage
- ARM64 processor
- GPU with OpenGL ES 3.0 or higher
- NNAPI support for hardware acceleration
Performance Notes
Model inference times (approximate, varies by device):
Budget Device (Snapdragon 600 series):
├── LaMa Inpainting: 3-5 seconds
├── edgeSAM Encoder: 2-3 seconds
├── edgeSAM Decoder: 200-300 ms
└── Auto Enhance: 1-2 seconds
Mid-range Device (Snapdragon 700 series):
├── LaMa Inpainting: 1-2 seconds
├── edgeSAM Encoder: 1-1.5 seconds
├── edgeSAM Decoder: 100-150 ms
└── Auto Enhance: 500-800 ms
Flagship Device (Snapdragon 8 series):
├── LaMa Inpainting: 0.5-1 second
├── edgeSAM Encoder: 0.5-0.8 seconds
├── edgeSAM Decoder: 50-80 ms
└── Auto Enhance: 200-400 ms
App Size
- APK: Approximately 150-200 MB
- ONNX Models: ~180 MB
- Vosk Model: ~50 MB
- Code and Resources: ~20 MB
Runtime Storage
- Cache: 50-500 MB (varies with usage)
- Temporary files: 100-1000 MB (high-resolution editing)
- User images: Depends on usage
Optional (for cloud features)
- Stable internet connection
- Recommended: 5 Mbps or higher
- Cloud API access
Fully Functional Offline
- All core editing features
- On-device AI models
- Voice recognition
- No internet required for basic operation