unvailai/3DKX_1.0b

	---
	model:
	  base_learning_rate: 1.0e-04
	  target: ldm.models.diffusion.ddpm.LatentDiffusion
	  params:
	    linear_start: 0.00085
	    linear_end: 0.0120
	    num_timesteps_cond: 1
	    log_every_t: 200
	    timesteps: 1000
	    first_stage_key: "jpg"
	    cond_stage_key: "txt"
	    image_size: 64
	    channels: 4
	    cond_stage_trainable: false # Note: different from the one we trained before
	    conditioning_key: crossattn
	    monitor: val/loss_simple_ema
	    scale_factor: 0.18215
	    use_ema: False

	    scheduler_config: # 10000 warmup steps
	      target: ldm.lr_scheduler.LambdaLinearScheduler
	      params:
	        warm_up_steps: [ 10000 ]
	        cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
	        f_start: [ 1.e-6 ]
	        f_max: [ 1. ]
	        f_min: [ 1. ]

	    unet_config:
	      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
	      params:
	        image_size: 32 # unused
	        in_channels: 4
	        out_channels: 4
	        model_channels: 320
	        attention_resolutions: [ 4, 2, 1 ]
	        num_res_blocks: 2
	        channel_mult: [ 1, 2, 4, 4 ]
	        num_heads: 8
	        use_spatial_transformer: True
	        transformer_depth: 1
	        context_dim: 768
	        use_checkpoint: True
	        legacy: False

	    first_stage_config:
	      target: ldm.models.autoencoder.AutoencoderKL
	      params:
	        embed_dim: 4
	        monitor: val/rec_loss
	        ddconfig:
	          double_z: true
	          z_channels: 4
	          resolution: 256
	          in_channels: 3
	          out_ch: 3
	          ch: 128
	          ch_mult:
	          - 1
	          - 2
	          - 4
	          - 4
	          num_res_blocks: 2
	          attn_resolutions: []
	          dropout: 0.0
	        lossconfig:
	          target: torch.nn.Identity

	    cond_stage_config:
	      target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
	---
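To make the schedule-related values in the config above more concrete, here is a minimal Python sketch of what they imply. It assumes the usual latent-diffusion conventions: a "linear" beta schedule that is spaced linearly in square-root space and then squared, and a learning-rate multiplier that ramps linearly from `f_start` to `f_max` over `warm_up_steps`. The function names `linear_beta_schedule` and `lr_multiplier` are hypothetical helpers written for illustration, not part of the `ldm` package, and the formulas are an assumption about how the referenced classes behave.

```python
import math


def linear_beta_schedule(linear_start=0.00085, linear_end=0.0120, timesteps=1000):
    """Return `timesteps` betas, linear in sqrt-space and then squared.

    Assumption: this mirrors the "linear" schedule convention commonly
    used with LatentDiffusion configs; it is not copied from the model's code.
    """
    lo, hi = math.sqrt(linear_start), math.sqrt(linear_end)
    step = (hi - lo) / (timesteps - 1)
    return [(lo + i * step) ** 2 for i in range(timesteps)]


def lr_multiplier(step, warm_up_steps=10000, f_start=1.0e-6, f_max=1.0,
                  f_min=1.0, cycle_length=10_000_000_000_000):
    """Return the factor applied to base_learning_rate at a training step.

    Assumption: linear warm-up from f_start to f_max, then a linear move
    from f_max towards f_min over the (very long) cycle. With
    f_max == f_min == 1.0 the factor stays at 1.0 after warm-up.
    """
    if step < warm_up_steps:
        return (f_max - f_start) / warm_up_steps * step + f_start
    return f_min + (f_max - f_min) * (cycle_length - step) / cycle_length


betas = linear_beta_schedule()
base_learning_rate = 1.0e-04

# Effective learning rate at a few training steps.
for s in (0, 5000, 10000, 50000):
    print(s, base_learning_rate * lr_multiplier(s))
```

With the values from the config, the multiplier reaches 1.0 at step 10000 and stays there, so the learning rate is effectively constant at `base_learning_rate` after warm-up.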

	Model name: H&A 3DKX
	Model version: 1.0b

	## Description:

	An SFW model with limited NSFW capabilities (suggestive only) that is highly versatile for 3D renders.
	The model supports two distinct, well-balanced styles, selected by how the prompt begins.
	For a more "cartoony" face on your 3D characters, start your prompt with "3d cartoon of";
	for the classic 3D render style, start it with "a 3d render of".
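The two prompt prefixes described above can be applied programmatically. The sketch below is illustrative only: `STYLE_PREFIXES` and `build_prompt` are hypothetical names, and the style keys `"cartoon"` and `"render"` are labels chosen for this example. Only the two prefix strings come from this model card.

```python
# Prefix strings taken from the description; the dictionary keys are made up.
STYLE_PREFIXES = {
    "cartoon": "3d cartoon of",
    "render": "a 3d render of",
}


def build_prompt(subject: str, style: str = "render") -> str:
    """Prepend the style prefix to the subject of the prompt."""
    if style not in STYLE_PREFIXES:
        raise ValueError(f"unknown style {style!r}; expected one of {sorted(STYLE_PREFIXES)}")
    return f"{STYLE_PREFIXES[style]} {subject.strip()}"


print(build_prompt("a knight in ornate armor", style="cartoon"))
print(build_prompt("a steampunk city at dusk"))
```

The resulting string can then be passed as the text prompt to whichever Stable Diffusion front end you use to run the checkpoint.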

	## Dataset:
	- between 140 and 180 pictures of 3D renders of all kinds

	## Has a high success rate at:
	- SFW portraits, full-body poses, close-ups, etc.
	- high versatility in terms of outputs; it isn't limited to performing well on portraits
	- landscapes, cyberpunk, steampunk, natural, sci-fi, etc.
	- 2B from NieR: Automata (don't ask us why)
	- different body types
	- different ethnicities
	- NSFW portraits, full-body poses, close-ups, etc.

	## What it "in theory" shouldn't excel at:
	- anything outside the scope of portraits, people, landscapes, game artworks, 3D sculptures, 3D fantasy, 3D film stills, etc
	- celebrities
	- highly specific animated cartoon characters
	- multiple subjects
	- highly specific video-game characters
	- pornography, genitalia, and highly explicit material