Skyfall 31B v4.2: Uncensored Roleplay AI Model

Creative writers and roleplayers have long struggled with AI models that refuse character interactions, censor dialogue mid-conversation, or break immersion with safety warnings during fictional scenarios. Skyfall 31B v4.2 addresses this friction by removing content filters while maintaining coherent narrative generation across extended roleplay sessions.

This 31-billion-parameter model builds on the Qwen architecture with specialized fine-tuning for creative fiction and character-driven dialogue. Unlike base models trained primarily on factual content, Skyfall prioritizes narrative consistency and character voice preservation across multi-turn conversations.

Key Specs

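Skyfall 31B v4.2 runs on the Qwen2.5-32B foundation with approximately 31 billion parameters. At 16-bit precision the weights alone occupy about 62GB (31 billion parameters at 2 bytes each), which is beyond any single consumer GPU. The 4-bit GGUF quantization shrinks the file to about 18GB, so plan for roughly 20GB of VRAM to hold it entirely on the GPU; a 12GB card can still run it by offloading some layers to system RAM, at reduced speed. Quantized versions reportedly retain about 95% of the original model’s coherence while fitting 24GB consumer GPUs such as the RTX 4090.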
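A quick way to sanity-check memory figures like these is to multiply the parameter count by the storage size of each weight. The sketch below does that arithmetic; the 4.5 bits-per-weight value is an assumption standing in for a typical 4-bit GGUF quantization, and the result covers the weights only, not the KV cache or runtime overhead.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 31e9  # parameter count stated in the article

fp16_gb = weight_memory_gb(N_PARAMS, 16)   # 16-bit weights
q4_gb = weight_memory_gb(N_PARAMS, 4.5)    # ASSUMPTION: ~4.5 bits/weight for a 4-bit quant

print(f"16-bit weights: ~{fp16_gb:.0f} GB")
print(f"4-bit weights:  ~{q4_gb:.1f} GB")
```

Actual VRAM use will be higher than these numbers because the context cache grows with conversation length.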

The context window extends to 32,768 tokens, allowing roughly 24,000 words of conversation history before older messages drop from memory. This capacity supports long roleplay sessions without losing character details or plot threads established earlier in the conversation, provided they still fall inside the window.
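To make "older messages drop from memory" concrete, here is a minimal sketch of a sliding-window trim. It is not how any particular front end implements it: the words-per-token ratio is a rough rule of thumb consistent with the figures above, and `trim_history` and its `reserve` parameter are hypothetical names chosen for illustration.

```python
import math

CONTEXT_TOKENS = 32_768   # context window stated in the article
WORDS_PER_TOKEN = 0.75    # ASSUMPTION: rough average; real tokenizers vary


def estimate_tokens(text: str) -> int:
    """Crude token estimate from a whitespace word count."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


def trim_history(messages: list[str], budget: int = CONTEXT_TOKENS,
                 reserve: int = 512) -> list[str]:
    """Keep the newest messages that fit in (budget - reserve) tokens.

    `reserve` leaves room for the model's next reply.
    """
    available = budget - reserve
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):      # walk from newest to oldest
        cost = estimate_tokens(msg)
        if used + cost > available:
            break                       # everything older is dropped
        kept.append(msg)
        used += cost
    kept.reverse()                      # restore chronological order
    return kept
```

A production setup would count tokens with the model's real tokenizer and usually pins the system prompt and character card so they are never trimmed.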

Response generation averages 18-25 tokens per second on an RTX 4090 at 4-bit quantization. The model accepts standard ChatML formatting and works with popular inference engines including KoboldCpp, Text Generation WebUI, and LM Studio.

Training data includes fiction archives, screenplay databases, and curated roleplay logs. The v4.2 release specifically improved handling of multiple characters in group scenes and reduced repetition in extended conversations beyond 15,000 tokens.

Who Benefits

Fiction writers use Skyfall for brainstorming character interactions without content restrictions interrupting creative flow. The model generates dialogue that maintains distinct voices for different characters rather than defaulting to generic responses.

Tabletop RPG groups integrate the model as a dynamic game master assistant. It generates NPC dialogue, describes environments, and responds to player actions while recalling campaign details from earlier in the conversation, as long as they remain within the context window. The extended context window tracks multiple plot threads simultaneously.

Interactive fiction developers prototype narrative branches and test dialogue trees. The model’s consistency across long conversations helps identify plot holes and character inconsistencies before committing to final writing.

Roleplaying communities on platforms like Discord and forums employ Skyfall for collaborative storytelling. Multiple users contribute to shared narratives while the model fills supporting character roles and maintains story continuity.

Quick Start

Download the model from https://huggingface.co/Skyfall-AI/Skyfall-31B-v4.2 in either full precision (62GB) or 4-bit quantized format (18GB). The Q4_K_M quantization offers the best balance between quality and resource requirements for most users.

Install KoboldCpp or Text Generation WebUI as your inference engine. For KoboldCpp, launch with these settings:

python koboldcpp.py --model skyfall-31b-v4.2-q4.gguf --contextsize 32768 --gpulayers 41 --usecublas

Configure the sampling parameters for roleplay output. A temperature between 0.85 and 1.1 produces creative responses without excessive randomness. Set repetition penalty to 1.08 and top-p to 0.9.
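If you drive KoboldCpp from a script rather than its web UI, the same settings can be sent in a request body. The sketch below assumes KoboldCpp's KoboldAI-compatible `/api/v1/generate` endpoint on its default local port and the field names `rep_pen`, `top_p`, `max_length`, and `max_context_length`; check these against the API documentation for your installed version. The `max_length` value and the helper names are illustrative choices, not values from the article.

```python
import json
import urllib.request


def build_payload(prompt: str, temperature: float = 0.95,
                  max_length: int = 300) -> dict:
    """Request body using the sampling settings suggested above."""
    return {
        "prompt": prompt,
        "max_context_length": 32768,  # matches --contextsize
        "max_length": max_length,     # ASSUMPTION: reply cap in tokens, adjust to taste
        "temperature": temperature,   # article suggests 0.85 to 1.1
        "top_p": 0.9,
        "rep_pen": 1.08,              # repetition penalty
    }


def generate(prompt: str,
             url: str = "http://localhost:5001/api/v1/generate") -> str:
    """Send the prompt to a locally running server (ASSUMED endpoint and response shape)."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["results"][0]["text"]
```

Calling `generate(...)` requires the server from the launch command to be running; `build_payload` can be inspected on its own.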

Format prompts using ChatML structure with system messages defining the scenario and character cards:

<|im_start|>system
You are roleplaying as [character name]. [Character description and personality traits]
<|im_end|>
<|im_start|>user
[User's roleplay action or dialogue]
<|im_end|>
<|im_start|>assistant

The model generates responses in character without requiring additional prompting about content policies or fictional context.
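When prompts are assembled programmatically, a small helper can produce the ChatML layout shown above. This is a minimal sketch that mirrors the template's line breaks; the function names and the example character are hypothetical, and most front ends apply the template for you.

```python
def chatml_block(role: str, content: str) -> str:
    """One ChatML message, laid out like the template above."""
    return f"<|im_start|>{role}\n{content}\n<|im_end|>\n"


def build_prompt(system: str, turns: list[tuple[str, str]]) -> str:
    """Assemble a system message plus prior turns, ending with an open assistant turn."""
    prompt = chatml_block("system", system)
    for role, content in turns:         # role is "user" or "assistant"
        prompt += chatml_block(role, content)
    # Leave the assistant block open so the model writes the reply.
    return prompt + "<|im_start|>assistant\n"


# Hypothetical character card and opening line, for illustration only.
example = build_prompt(
    "You are roleplaying as Mira, a wary smuggler captain.",
    [("user", "I step aboard and ask where we're headed.")],
)
print(example)
```

Passing earlier assistant replies back in `turns` is what lets the model keep its character voice across a session.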

Alternatives

MythoMax 13B offers similar uncensored capabilities in a smaller package requiring only 8GB VRAM. The reduced parameter count limits character consistency in conversations exceeding 10,000 tokens but runs on mid-range hardware.

Nous Hermes 2 Pro focuses on instruction-following with minimal content filtering. While not specifically tuned for roleplay, it handles creative writing tasks competently and includes function-calling capabilities for tool integration.

MythoMist 7B combines multiple models through merge techniques to balance creativity and coherence. The 7-billion parameter size enables CPU-only inference on systems with 16GB RAM, though generation speed drops to 2-4 tokens per second.

Claude 3 Opus provides superior writing quality through Anthropic’s API but applies content policies that interrupt certain fictional scenarios. Pricing of $15 per million input tokens and $75 per million output tokens makes extended roleplay sessions expensive compared to locally run alternatives.
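To see why long sessions get expensive on a metered API, note that each turn typically resends the whole conversation as input. The sketch below estimates cost under that assumption, using Claude 3 Opus list prices of $15 per million input tokens and $75 per million output tokens; it ignores prompt caching and other discounts, and the function name and token counts are hypothetical.

```python
INPUT_USD_PER_MTOK = 15.0    # Claude 3 Opus list price, input
OUTPUT_USD_PER_MTOK = 75.0   # Claude 3 Opus list price, output


def session_cost(turns: int, user_tokens: int, reply_tokens: int,
                 system_tokens: int = 500) -> float:
    """Estimated USD cost when every turn resends the full history as input."""
    history = system_tokens          # tokens accumulated so far
    total_in = 0
    total_out = 0
    for _ in range(turns):
        history += user_tokens       # new user message joins the history
        total_in += history          # whole history is billed as input
        total_out += reply_tokens    # reply is billed as output
        history += reply_tokens      # reply joins the history for next turn
    return (total_in / 1e6) * INPUT_USD_PER_MTOK \
        + (total_out / 1e6) * OUTPUT_USD_PER_MTOK


# Example: 100 turns, ~80-token user messages, ~250-token replies (made-up sizes).
print(f"${session_cost(100, 80, 250):.2f}")
```

Because input tokens grow with every turn, cost rises roughly with the square of session length, which is the main reason local inference is attractive for long roleplay.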

Each option trades off model size, hardware requirements, content restrictions, and output quality differently, so the right choice depends on your use case and available resources.
