Hi there, do any of you know of an in-depth guide/tutorial to ComfyUI? I'm looking for something that goes beyond the general setup and talks about the different nodes, where in the workflow they should be placed and so on. Basically a resource that shows me how to build advanced workflows without too much trial and error.
I've been using ComfyUI, but I feel like I'm working blindfolded and I don't know which screws there are to adjust.
After a bumpy start (see my other thread about it), I'm starting to feel a bit comfortable with SDXL, to the point that I probably won't look back at the 1.5 models. This wizard-hat-wearing cat was generated in A1111 with:
"a cute kitty cat wearing a wizard hat, candy rays beaming out of the cat ears, (a swirling galaxy of candy pops background:0.7), 1980's style digital art, hyperrealistic, paintbrush, shallow depth of field, bokeh, spotlight on face, cinematic lighting " Negative (from a standard style I use): "(bad anatomy:1.1), (high contrast:1.3), watermark, text, inscription, signature, canvas frame, (over saturated:1.2), (glossy:1.1), cartoon, 3d, ((disfigured)), ((bad art)), ((b&w)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), extra limbs, (bad anatomy), gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), mutated hands, (fused fingers), (too many fingers), (((long neck))), Photoshop, video game, ugly, tiling, poorly drawn hands, 3d render
Generated at 1024x1024 without refiner.
There are a few things I found to be aware of when working with SDXL in A1111:
- make sure you upgraded A1111 to version 1.5.1 (do a "git pull" in the install directory)
- I needed to add "--medvram" to my command line arguments, otherwise I'd get out of memory errors (12 GB VRAM)
- make sure you have your VAE set to "automatic" or use the SDXL VAE (can be downloaded from huggingface). Older VAEs won't work
- older LoRas don't work and you will get errors
- there is a noise offset LoRa for SDXL (sd_xl_offset_example-lora_1.0) which does work, but I don't see much difference in the images. With the LoRa they are a tiny bit crisper. However, this LoRa doesn't work with the refiner model (you will get errors)
And the biggest one for me:
- don't use arbitrary image proportions; stick to the ones posted here: https://platform.stability.ai/docs/features/api-parameters This was the biggest mistake I made initially. With other image sizes I'd get super wonky images and very unsatisfying results. Since I stick to the recommended dimensions, my images are much, much better.
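For reference, the first two points above boil down to something like this on a Linux install (webui-user.sh is the standard launcher config file; on Windows it's webui-user.bat instead):

```shell
# inside the A1111 install directory: pull the latest release
git pull

# in webui-user.sh (webui-user.bat on Windows), add the low-VRAM flag:
export COMMANDLINE_ARGS="--medvram"
```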
A word on the refiner model: as of now I don't see big quality improvements when I run the refiner in img2img at about 0.1 to 0.25 denoising. I think I will play around more with this at higher denoising strengths and see what I can get out of it.
Anyway, I think SDXL is a huge improvement and I'm starting to get really exciting results already. Cheers :)
The prompt was just an example and usually my prompts get quite a bit longer than that. But with 1.5 models I manage to get what I want to see eventually. I also find that throwing in qualifiers like "mesmerizing" does do something to the image, although it can be subtle.
However, what I wanted to say here was that in SDXL my prompting seems to go nowhere and I feel I'm not able to get out the kind of image I have in my head. Keeping the prompt example: in SD1.5, using a custom model like Deliberate 2.0, I'm able to end up with an image of a hat-wearing cat surrounded by surreal-looking candy pops (however the final prompt for this ends up reading). In SDXL my images "break" (i.e. start looking flat, unrefined or even bizarre) at some point long before I can direct them towards my imagined result. All my usual approaches, like reducing CFG, re-ordering prompts, or using a variety of qualifiers, don't seem to work like I'm used to.
And tbh, I think this is to be expected. These are new models, so we need new tools (prompts) to work with them. I just haven't learned how to do it yet and I'm asking how others do it :)
Hi,
I'm looking into hosting a blog site for myself - nothing fancy, just a site where I can publish some of my thoughts and ideas. Maybe I also want a section to publish images. So, basically something lean and mostly text only.
What's the easiest way to set this up for myself?
Hi,
I'm struggling a bit to get good results with SDXL and I'm wondering if I'm doing something wrong ... I tried A1111 and ComfyUI and have been underwhelmed in both cases. I can get bland, boring-looking images out of it, which seem to be ok from a technical point of view (i.e. they seem to be correctly generated, without weird artifacts or anything like that). However, whenever I try to get something more elaborate, my prompting leads nowhere. I can get "a cat" and it will generate a picture of a cat. But if I try "a cat wearing a wizard hat floating in a mesmerizing galaxy of candy pops", these kinds of prompts seem to quickly break the final image. I'm not talking about tailored models and LoRas here, but I seem to be able to do much more interesting stuff with the Deliberate 2.0 model than with SDXL.
So, what's your experience so far? Does the community need to catch up first and work on custom models, LoRas, and so on to really get things cooking? Or do I need to learn how to work with XL better? I was actually looking forward to having a "bland" and hopefully rather unbiased model to work with, where not every prompt desperately tries to become a hot anime girl, but for now I'm struggling to get interesting images.
For reference, I updated my A1111 installation with "git pull" (which seems to have worked, as I now have an SDXL tab in my settings) and downloaded the 1.0 model, refiner and VAE from huggingface. I can generate txt2img in A1111 with the base model, however I can't seem to get img2img with the refiner model to work ... On ComfyUI I found a premade workflow that runs the base model first and then passes the latent to the refiner, which seems to work just fine technically, but also seems to require a different approach to prompting than I'm used to.
I look them up at lemmyverse.net
I go there about once a week to see if there are new communities I might be interested in. I'm on a selfhosted single-user instance, so my "all" is identical to my "subscribed" and this is how I populate my feed.
Yeah, reducing CFG can help a lot. It sometimes feels to me, that getting a good image is knowing at what point to let loose ...
Hi there, I'm curious to know other people's approach in working with Stable Diffusion. I'm just a hobbyist myself and work on creating images to illustrate the fictional worlds I'm building for fun.
However, I find that getting very specific images (that are still visually pleasing) is really difficult.
So, how do you approach it? Are you trying to "force" your imagined picture out by making use of ControlNet, inpainting and img2img? I find that this approach usually leads to the exact image composition I'm after but will yield completely ugly pictures. Even after hours of inpainting, the best I can get to is "sorta ok'ish", surely far away from "stunning". I've played around with ControlNet for dozens of hours already, experimenting with multi-control, weighting, ControlNet applied only to parts of the image, different starting and ending steps, ... but it's only kinda getting there.
Now, opposed to that, a few prompts can generate really stunning images, but they will usually only vaguely resemble what I had in mind (if it's anything other than a person in a generic pose). Composing an image by prompts alone is by no means easier or faster than the more direct approach mentioned above. And I seem to always arrive at a point where the "prompt breaks". I don't know how else to describe this, but in my experience, when I get too specific in prompting, the resulting image will suddenly become ugly (like architecture that is described too closely in the prompt suddenly having all the wrong angles).
So, how do you approach image generation? Do you give a few prompts and see what SD can spit out, taking delight in the unexpected results and exploring visual styles more than specific image compositions? Or are you stubborn like me and want to use it as a tool for illustrating imagination - which it doesn't seem nearly as good at as the former?
Hi there, On my router/modem I cannot change the DNS entries, so just using AdGuard/Pi-hole for DNS-based ad blocking doesn't work. Would a separate router circumvent this problem? Could I set up AdGuard (or Pi-hole) on a Raspberry Pi and use it as the DNS server for my home network?
The plan would be to use my ISP-provided router just as a modem to connect to the internet, then use a second router to provide my home network, where AdGuard/Pi-hole can also do their thing.
Would this setup work and how would I need to configure it?
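If I understand it right, the Pi side of that plan would look something like this (the install command is Pi-hole's official one-liner; the IP address is just a made-up example for my network):

```shell
# on the Pi: give it a static IP first, then run the official installer
curl -sSL https://install.pi-hole.net | bash

# then, in the second router's DHCP settings, point the DNS server it
# hands out to clients at the Pi's address (e.g. 192.168.2.2), so every
# device on the network resolves through Pi-hole automatically
```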
When I started I was just copying from online galleries like Civitai or Leonardo.ai, which gave me noticeably better images than what I had come up with myself before. However, it seemed to me that many of these images may themselves just have copied prompts without understanding what's really going on in them, so I started to experiment for myself.
What I do right now is build my images "from the ground up", starting with super basic prompts like "a house on a lake" and working from there. First I add descriptions to get the image composition right, then work in the style I'm looking for (photography, digital artwork, cartoon, 3D render, ...). Then I work in enhancers and see what they change. I found that one has to be patient, change only one thing at a time and always do a couple of images (at least a batch of 8) to see if and what the changes are.
So, I still comb through image galleries for inspiration in prompting, but now I will most of the time just pick one keyword or enhancer and see what it does to my own images.
It is a long process that requires many iterations, but I find it really enjoyable.
I just figured out that I could drag any of my images, made with A1111, into the UI and it would set up the corresponding workflow automatically. I was under the impression that this would only work for images already created with ComfyUI first. However, this gives great starting points to work with. I will play around with it tonight and see if I can extract upscaling and control-net workflows with it as a starting point from existing images.
Do you happen to have a tutorial for ComfyUI at hand that you can link and that goes into some detail? These custom workflows sound intriguing, but I'm not really sure where to start.
Hi there, I've seen a few videos on yt showing it off and it looks incredibly powerful in finetuning the outputs of SD. It also looks dauntingly complicated to learn how to use it effectively.
For those of you who have played around with it: do you think it gives better results than A1111? Is it indeed better at fine-tuning? How steep was the learning curve for you?
I'm trying to figure out if I want to put in the hours to learn how to use it. If it improves my ability to get exactly the images I want, I'll go for it. If it does what A1111 does, just dressed up differently, I'll sit it out :)
Please do, I'm thinking of starting to make LoRas as well and the tool looks like it would make the process much easier. Let me know how it goes for you.
I started with the smallest offering available and later upgraded to the second smallest, which now has 4GB RAM. I've also rented additional disk space, so I have 30GB now. RAM and CPU are certainly fine now, but I don't know yet about disk space. I've read that Lemmy/Mastodon can eat up space quickly and I've currently used up about half of mine.
You should be able to configure this differently: either switch off the confirmation mails completely or use the email credentials from another server.
I use Synapse as my Matrix server and Element as the client. It doesn't need port 25 (8008 and 8448 are needed in my setup). On Lemmy and Mastodon I configured outgoing mail via SMTP through my existing mail hoster, so I don't send mail from my own server. Also, all the googling I did said to stay away from self-hosting email, as it is a hassle not to get immediately blocked as a spam mail server ...
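For Lemmy, the relevant bit of config.hjson looks roughly like this (server, login and addresses here are placeholders for whatever your mail hoster gives you):

```hjson
email: {
  smtp_server: "smtp.example.com:587"
  smtp_login: "lemmy@example.com"
  smtp_password: "app-password-here"
  smtp_from_address: "noreply@example.com"
  tls_type: "starttls"
}
```

Mastodon takes the equivalent SMTP_* settings in its .env.production file.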
I use Synapse as the Matrix server and Element as client on desktop and mobile. It does support video calls, but so far I only tested it for a minute.
I spent a lot of time googling and on YouTube to get a basic understanding of what I was trying to achieve, 2 weeks of after-work time at least. If I had to guess, 40-50 hours in total. Getting a single piece to work by following a tutorial can be easy, but getting all the things working together was a struggle. Once I had a better grasp of what a reverse proxy is and how docker containers work together in networks, the pieces started to fall into place.
I have fail2ban running as well, didn't mention it in the OP. I also closed all ports besides 80 and 443, which are routed through my NPM proxy. SSH is allowed, but login only with an SSH key, no password authentication.
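Roughly, the firewall and SSH hardening amounts to this (assuming ufw as the firewall; the sshd_config path is the usual one on Debian-likes):

```shell
# allow only web traffic and SSH, deny everything else
ufw default deny incoming
ufw allow 80/tcp
ufw allow 443/tcp
ufw allow 22/tcp
ufw enable

# in /etc/ssh/sshd_config, key-only logins:
#   PasswordAuthentication no
#   PubkeyAuthentication yes
# then restart the ssh service to apply
```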
So far it's running well, but I expect things to break when I'll need to update parts of it. I have a snapshot from which I can reinstall, but recurring backups need yet to be set up.
Hi there, I was intrigued by the idea of self-hosting my social media accounts, but was more or less a complete noob with all things hosting. However, with the help of the community here (and quite a few hours spent on it) I finally have a working setup! Mastodon, Matrix, Lemmy, Nextcloud all self-hosted behind Nginx Proxy Manager.
Google can find a lot of answers, but sometimes some really specific input is needed - which you guys have provided over the last couple of weeks - so I just wanna say thank you for that!
So, I think I've ironed out a lot of things to get my self-hosted setup running, but it seems that Nginx Proxy Manager is causing me trouble. When I restart my server, the container with NPM restarts as expected, but I can't log into the web UI (the website comes up, but when I try to log in nothing happens) and it also doesn't provide the expected proxy functionality. I'm not sure what's happening - any advice would be welcome. Right now my only workaround is to delete the container and make it from scratch, but this also means recreating all proxy hosts + certificates from scratch as well ...
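For reference, this is roughly the compose file I'd expect to need, based on the standard NPM example (the bind-mount paths are my assumption about the usual layout). I'm wondering if missing these volume mounts is why everything is lost when I recreate the container - with data and letsencrypt mounted like this, the proxy hosts and certificates should at least survive:

```yaml
services:
  npm:
    image: jc21/nginx-proxy-manager:latest
    restart: unless-stopped
    ports:
      - "80:80"     # http
      - "443:443"   # https
      - "81:81"     # admin web UI
    volumes:
      - ./data:/app/data                 # proxy hosts, settings
      - ./letsencrypt:/etc/letsencrypt   # certificates
```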
I run Nginx with Nginx Proxy Manager web-ui, which makes setting up proxy hosts and handling letsencrypt certificates really easy. I also use Portainer to manage my docker containers. This works well for the stuff I mentioned above (Nextcloud, Matrix, Lemmy mostly)
If I can get Mastodon into the same setup, it'd be neat. I just found a lot of discussion with problems, so I thought I'll ask about it before I spend a few hours in vain :)
I finally managed to selfhost Lemmy and Matrix, now it is time to also get a selfhosted Mastodon instance up. A few questions before I start:
I did some research into the topic and it seems that Mastodon doesn't like to run behind an existing reverse proxy and that quite a few tweaks are necessary to get it running - can someone confirm this? Or is this something easily set up?
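From what I've read, the proxy side of Mastodon's stock nginx template boils down to something like this (ports 3000 and 4000 are the defaults for the web and streaming services; this is a trimmed sketch, not the full template):

```nginx
location / {
    proxy_pass http://127.0.0.1:3000;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-Proto $scheme;
}

location /api/v1/streaming {
    proxy_pass http://127.0.0.1:4000;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}
```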
I'm currently leaning towards running it on a dedicated VPS (due to the issue above and also because it seems to need quite a bit of disk space) - this opens up the option of a non-docker installation following the official install path. Do you think this will make it easier to keep it updated to new releases in the future?
If going with a docker install, there seem to be quite a few problems with updating (at least a lot of threads discussing failed update procedures sprang up when I googled "mastodon docker update") - can someone confirm? Are there easy-to-follow guides for a docker-based update routine?
Right now it seems the easiest would be to run it on a dedicated server, follow the native installation procedure and use the templates provided for nginx, certbot, ... Thoughts?