One small correction – the link in the latest version of KerasCV is keras-cv/keras_cv/models/stable_diffusion at master · keras-team/keras-cv · GitHub. The current link points to an older version.
Hi! Submitting my entry here. I named it Stable Collag-ion: it uses KerasCV's StableDiffusion to create collages from diffused images!
Colab notebook here: Google Colab
Hi, here’s my entry: it has pets and ugly Christmas sweaters
The Colab notebook is here
Here’s my submission: an implementation of the paper “Prompt-to-Prompt Image Editing with Cross Attention Control” using KerasCV.
TL;DR: It lets you edit the generated content by manipulating only the original prompt!
Other notebooks/Colabs and examples are in the repo. Check them out!
If you have any questions or suggestions please let me know!
My submission for the first Keras Community Prize is a simple application/workflow to create a mosaic art version of a given image. It uses KerasCV to generate the tiles from which the output mosaic art image is assembled.
The code and other files for the implementation are available from this GitHub repository.
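For readers curious how the tile-assembly step of a mosaic can work, here is a minimal sketch. It is not taken from the repository: it assumes each generated tile and each cell of the target image is summarised by its mean RGB colour, and that a cell is filled with the tile whose mean colour is closest. The function names (`mean_color`, `nearest_tile`, `build_mosaic_plan`) are hypothetical.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[float, float, float]


def mean_color(pixels: Sequence[RGB]) -> RGB:
    """Average the R, G and B channels over a non-empty list of pixels."""
    n = len(pixels)
    return (
        sum(p[0] for p in pixels) / n,
        sum(p[1] for p in pixels) / n,
        sum(p[2] for p in pixels) / n,
    )


def nearest_tile(cell_color: RGB, tile_colors: Sequence[RGB]) -> int:
    """Return the index of the tile whose mean colour is closest to the cell.

    Closeness is squared Euclidean distance in RGB space (an assumption;
    a perceptual colour space could be used instead).
    """
    def distance(i: int) -> float:
        return sum((a - b) ** 2 for a, b in zip(cell_color, tile_colors[i]))

    return min(range(len(tile_colors)), key=distance)


def build_mosaic_plan(
    cell_colors: List[List[RGB]], tile_colors: Sequence[RGB]
) -> List[List[int]]:
    """For every cell of the target grid, pick the index of the best tile."""
    return [[nearest_tile(c, tile_colors) for c in row] for row in cell_colors]


if __name__ == "__main__":
    tiles = [(255.0, 0.0, 0.0), (0.0, 0.0, 255.0)]  # a red tile and a blue tile
    target = [[(250.0, 10.0, 10.0), (5.0, 5.0, 240.0)]]  # 1x2 grid of cell colours
    print(build_mosaic_plan(target, tiles))
```

In a real workflow the tile colours would come from images generated with KerasCV and the cell colours from the input photo; the plan of indices then tells you which tile to paste where.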
Here’s my submission - notebook
It’s called “Merchy”. It’s an application to get design ideas generated by stable diffusion imprinted on different merchandise like bottles, bags, pens and mugs.
Do let me know if there are any suggestions!
Happy Holidays everyone!!!
This project is a collaborative effort between myself and @deep-diver. Below, we provide a detailed overview of the project. This is our submission.
The original Stable Diffusion model can generate high-quality images. However, it is still hard to get specific styles of images through prompt engineering alone. If we want specificity in the generated images, fine-tuning Stable Diffusion can be quite effective. Research shows that for this to work well, only the diffusion model needs to be fine-tuned while the text encoder and image decoder can be kept frozen. This brings us to a question: what if we could replace only the diffusion model of a currently deployed Stable Diffusion while keeping the other parts, the text encoder and image decoder, as is?
To this end, this project covers two subjects: (1) how to fine-tune Stable Diffusion from KerasCV, (2) how to deploy Stable Diffusion in various ways.
We believe that by collating these two subjects, ML Practitioners and Software Engineers will have good coverage of tools needed to put Stable Diffusion in different application use cases.
If you’re looking for more specificity in terms of style, texture, and text-image alignment, fine-tuning can bring in benefits. To demonstrate this, we fine-tuned the diffusion model on a custom dataset while keeping the text and image encoders frozen.
We borrowed ideas from this tutorial by Hugging Face and faithfully reimplemented the code in TensorFlow. Our reimplementation is customizable and supports mixed-precision training along with model checkpointing.
We believe this way, practitioners will be able to repurpose Stable Diffusion for their applications even better.
Check out the stable-diffusion-keras-ft repository for more details.
Stable Diffusion can be deployed in various ways since it primarily consists of three models (encoder/diffusion model/decoder) + some inference-time code. In this project, we cover different deployment scenarios with different platforms and frameworks including Google Kubernetes Engine and Hugging Face Endpoint with FastAPI, TensorFlow Serving, and Hugging Face custom handler. For more information, check out the keras-sd-serving repository.
Different applications come with varying needs in terms of serving infrastructure, compute budget, cost, etc. This is why we believe that by decoupling the deployment from these scenarios we can devise better deployment strategies.
1. All in one endpoint
Simply deploy Stable Diffusion as a whole to a single endpoint. In this case, you have all the pieces of code packed into a package. However, the problem with this approach is that the resources are not utilized optimally since Stable Diffusion runs three different models internally. The text encoder is good to go with CPUs, the decoder requires small-size GPUs, and the diffusion model requires much larger GPUs.
2. Three endpoints
To overcome this problem, you can split Stable Diffusion into three endpoints and have the client program call them sequentially. However, this introduces latency from the communication, encoding, decoding, and parsing steps between clients and servers.
3. One endpoint (original, inpainting, finetuned) with local processing
Instead, you can keep the encoder/decoder local while deploying the diffusion model on the cloud with heavy GPUs, since that is where most of the computation happens. You can mix and match local/cloud deployments of each part of Stable Diffusion. This flexibility also lets you replace only the diffusion part with a more specialized one, such as inpainting, while keeping the other parts untouched.
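To make the idea behind the third scenario concrete, here is a minimal sketch of a pipeline whose three stages are pluggable, so each one can run locally or behind an endpoint. It is not the code from keras-sd-serving: the class and function names are hypothetical, and the stage bodies are toy stand-ins (the "remote" functions only mimic what an HTTP call to a GPU-backed endpoint would return).

```python
from dataclasses import dataclass, replace
from typing import Callable, List

Vector = List[float]


@dataclass(frozen=True)
class StableDiffusionPipeline:
    """Three independent stages; each may be a local function or a remote call."""
    encode_text: Callable[[str], Vector]
    diffuse: Callable[[Vector], Vector]
    decode: Callable[[Vector], Vector]

    def __call__(self, prompt: str) -> Vector:
        context = self.encode_text(prompt)  # cheap: fine on CPU
        latent = self.diffuse(context)      # expensive: large GPU
        return self.decode(latent)          # moderate: small GPU or local


# Toy stand-ins for the real models (assumptions, for illustration only).
def local_text_encoder(prompt: str) -> Vector:
    return [float(len(word)) for word in prompt.split()]


def local_decoder(latent: Vector) -> Vector:
    return [min(255.0, max(0.0, x * 10.0)) for x in latent]


def remote_diffusion(context: Vector) -> Vector:
    # In practice this would send `context` to a cloud endpoint.
    return [x + 1.0 for x in context]


def remote_inpainting_diffusion(context: Vector) -> Vector:
    # A different, more specialised diffusion endpoint.
    return [x * 2.0 for x in context]


pipeline = StableDiffusionPipeline(local_text_encoder, remote_diffusion, local_decoder)

# Swap only the diffusion stage; the encoder and decoder are reused untouched.
inpainting_pipeline = replace(pipeline, diffuse=remote_inpainting_diffusion)

if __name__ == "__main__":
    print(pipeline("a cat"))
    print(inpainting_pipeline("a cat"))
```

The point of the design is that the encoder and decoder never need to be redeployed when the diffusion model changes, which is what makes swapping in a fine-tuned or inpainting model cheap.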
Here is our submission: “From cells to planets: A journey with KerasCV’s StableDiffusion”.
We create an illustration which ranges from a cell to large planets. It demonstrates the sheer scale of the objects in our universe! We use the inpaint() method to achieve this effect.
Here is our submission: “Replacing objects in a space without segmentation using KerasCV’s StableDiffusion”
We modify and use the generate_image() method to generate an image of an object in a set background. We then use a prompt describing a new object to replace the object in the scene.
This submission is a collaborative effort by Mihir Godbole, Aditya Kane and Parth Dandavate.
I have done my part in this awesome community competition. I hope the organisers can do more like this.
My idea is to generate a music video using lyrics from the MusixMatch API.
Hope you all are doing fine for the holidays.
Unfortunately, I found out about this competition late, so I went with the first idea that popped into my mind: my Grandma and aunt wanted something similar a long time ago, and this gave me a reason to do it. So, thank you!
As the name suggests, Morse Code meets Stable Diffusion via various mediums.
Hope you enjoy this.
Everyone, have a great holiday and a positive new year!
Here is our submission:
Diffused Live Weather Cam (24/7) - We use Keras’ Stable Diffusion to generate (realistic) real-time weather pictures of German cities (and towns).
Thanks for organizing this and have a happy new year everyone!
Hi. Here’s my submission: Google Colab
Thanks a lot for the awesome entries, everyone! I’m counting 12 entries in total. Submissions are now closed. We will get to judging and awarding the prizes shortly after the Keras team regroups after the holidays.
Any update regarding prizes?
The Keras team completed the prize judging process. We had a lot of fun going through the entries and discussing them. We were really impressed with the creativity of the participants and the effort they put into their entries. We extend our warmest thanks to all participants!
Here are the results:
Winner (5k): Prompt to Prompt Editing by @Miguel_Calado
The ability to do text-based image editing is definitely one of the most useful applications of generative image models. This project is an implementation of the paper “Prompt-to-Prompt Image Editing with Cross Attention Control”. The code quality is outstanding and the generation results are excellent. The project is also very well presented, with extensive explanations and clear code examples. We really enjoyed reviewing this one!
Winner (2k): Fine-tuning Stable Diffusion by @Sayak_Paul and @deep-diver
For those with enough GPU cycles (and memory!) to pull it off, fine-tuning Stable Diffusion on your own dataset is one of the most practically useful things you can do with the model. The code quality in this project is excellent and the generation quality is quite decent (and would presumably improve with further training). A lot of work clearly went into this project and we expect it to be broadly useful for the generative Keras ecosystem!
Winner (2k): Weather live cam by @avocardio
This project lets you generate a “live” view of the current weather in any German city. We could see it being practically useful as a way to generate an expressive background image for a weather app (“here’s what the current weather looks like in your city”) without requiring an actual livecam. It’s a clever and very original idea. The contents are very well presented, with clear explanations and visuals, and the code quality is excellent. Great example of sophisticated integration between different simple components to create something cool!
Honorable mention: Text to 3D Point Cloud by @Jobayer
This project enables you to go from a text prompt to a 3D point cloud of the corresponding object, by pipelining together Stable Diffusion and an image-to-pointcloud model. It’s a highly original idea that we see as being potentially very useful for 3D asset prototyping. The project has very high quality code and is very readable. (We recommend adding more text explanations and comments (in particular an introduction to present the use case and the method used) to improve discoverability and readability.)
Honorable mention: Morse diffusion by @Heman_B
This project enables you to dictate a generative prompt in Morse code via head movements, which could help people who, due to motor impairments, can use neither a keyboard nor their voice. It’s a very original project that we found to be well executed and well presented. It uses the Mediapipe Facemesh model to go from a webcam feed to a Morse string, then feeds it into Stable Diffusion. A lot of effort clearly went into this idea, and the code was pleasant to read.
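As an illustration of the step between the Morse string and the text prompt, here is a minimal decoding sketch. It is not the project's code: it assumes letters are separated by spaces and words by "/", covers only the 26 Latin letters of International Morse code, and uses a hypothetical function name, `decode_morse`.

```python
# International Morse code for the letters A-Z.
MORSE_TO_CHAR = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E",
    "..-.": "F", "--.": "G", "....": "H", "..": "I", ".---": "J",
    "-.-": "K", ".-..": "L", "--": "M", "-.": "N", "---": "O",
    ".--.": "P", "--.-": "Q", ".-.": "R", "...": "S", "-": "T",
    "..-": "U", "...-": "V", ".--": "W", "-..-": "X", "-.--": "Y",
    "--..": "Z",
}


def decode_morse(message: str) -> str:
    """Decode a Morse string into upper-case text.

    Letters are separated by whitespace and words by "/" (an assumed
    convention). Unknown symbols are decoded as "?".
    """
    words = []
    for morse_word in message.split("/"):
        letters = [MORSE_TO_CHAR.get(symbol, "?") for symbol in morse_word.split()]
        words.append("".join(letters))
    return " ".join(words)


if __name__ == "__main__":
    prompt = decode_morse("-.-. .- - / .... .- -")
    print(prompt)  # the decoded text would then be used as the generation prompt
```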
Congrats to the winners!
We will be in touch with the three winners. Again, thanks a lot to everyone for participating, we really enjoyed reading through all of the entries!
Thank you so much. It’s an honor to be alongside so many amazing creators. The world is a better place with y’all.
Thanks a lot @fchollet and team for the challenge. It was a great learning experience for us and we thoroughly enjoyed it.
A lot of cool projects, thanks everyone! I really hope this format is repeated again some time, it was a lot of fun.
Hi everyone, I’m completely stunned that I actually won, especially after looking at all the awesome and high-quality projects that everyone submitted. It is an honor and a privilege.
The “Prompt-to-Prompt” paper is one of those that has a plain, simple idea with beautiful results, and that tempted me to reproduce it. I recommend that everyone read it.
Hopefully, this implementation does justice to such an elegant idea.
Thank you to the Keras team for organizing this competition and to everyone that participated - I’m sure we will be learning tons from each project.
Hopefully, more of these initiatives get repeated sometime soon!