  • ofcourse@kbin.social to Selfhosted@lemmy.world · Selfhosted LLM (ChatGPT)

    You can absolutely self-host LLMs. The HELM team has done an excellent job benchmarking the efficiency of different models on specific tasks, so that would be a good place to start. You can balance model performance on your specific task against the model’s efficiency - in most situations, larger models perform better but need more GPUs or are only available via APIs.

    There are currently three different approaches to using AI for a custom task or application:

    1. Train a base LLM from scratch - this is like creating your own GPT-style model. It gives you the maximum level of control, but the amount of compute, time, and data required for training makes it impractical for an end user. There are many open-source base LLMs already published on HuggingFace that can be used instead.

    2. Fine-tune a base LLM - starting with a base LLM, fine-tune it for a certain set of tasks. For example, you can fine-tune a model to follow instructions or to act as a chatbot. InstructGPT and GPT-3.5+ are examples of fine-tuned models. This approach lets you create a model that understands a specific domain or a set of instructions particularly well compared to the base LLM. However, any time training a large model is involved, it will be expensive. If you are starting out, I’d suggest exploring this as a v2 step for improving your model (a minimal fine-tuning sketch follows this list).

    3. Prompt engineering or indexing using an existing LLM - starting with an existing model, create prompts to achieve your objective. This approach gives you the least control over the model itself, but it is the most efficient, so I would suggest it as the first approach to try. Langchain is the most widely used tool for prompt engineering and supports self-hosted base or instruct LLMs. If your task is search and retrieval, you use an embeddings model instead: generate embeddings for all your content and store them as vectors, then convert a user query to an embedding with the same model and retrieve the most similar content by vector similarity (see the retrieval sketch after this list). Langchain provides this capability, but IMO sentence-transformers may be a better starting point for a self-hosted retrieval application. Without any intention to hijack this post, you can check out my project - synology-photos-nlp-search - as an example of a self-hosted retrieval application.
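
    As a sketch of option 2, here is a minimal fine-tuning loop using HuggingFace transformers. The base model, dataset file, and field names are placeholders for illustration, not specific recommendations:

    ```python
    # Minimal fine-tuning sketch (approach 2). Placeholders: the base model
    # ("EleutherAI/pythia-410m") and the dataset ("instructions.jsonl" with
    # "instruction"/"response" fields) stand in for your own choices.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    base = "EleutherAI/pythia-410m"
    tokenizer = AutoTokenizer.from_pretrained(base)
    tokenizer.pad_token = tokenizer.eos_token  # causal LMs often lack a pad token
    model = AutoModelForCausalLM.from_pretrained(base)

    dataset = load_dataset("json", data_files="instructions.jsonl", split="train")

    def tokenize(batch):
        # Join each instruction/response pair into one training text.
        texts = [f"Instruction: {i}\nResponse: {r}"
                 for i, r in zip(batch["instruction"], batch["response"])]
        return tokenizer(texts, truncation=True, max_length=512)

    tokenized = dataset.map(tokenize, batched=True,
                            remove_columns=dataset.column_names)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="ft-out",
                               per_device_train_batch_size=2,
                               num_train_epochs=1),
        train_dataset=tokenized,
        # mlm=False gives standard causal-LM labels (inputs shifted by one).
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    model.save_pretrained("ft-out/final")
    ```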
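
    And here is a minimal sketch of the retrieval flow from option 3, assuming sentence-transformers; the model name and documents are illustrative:

    ```python
    # Retrieval sketch (approach 3): embed all content once, embed the query
    # with the same model, then rank by cosine similarity. The model and the
    # documents below are placeholders.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")

    # Index step: embed all content and keep the vectors.
    documents = [
        "a dog playing on the beach",
        "grandma's birthday dinner at home",
        "snowy mountains during the ski trip",
    ]
    doc_embeddings = model.encode(documents, convert_to_tensor=True)

    # Query step: same model for the query, then rank by vector similarity.
    query = "pets by the sea"
    query_embedding = model.encode(query, convert_to_tensor=True)
    scores = util.cos_sim(query_embedding, doc_embeddings)[0]

    for idx in scores.argsort(descending=True):
        print(f"{scores[idx].item():.3f}  {documents[idx]}")
    ```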

    To learn more, I have found the recent deeplearning.ai short courses to be quite good - they are short, comprehensive, and free.


  • I agree with OP that instances shutting down at any time is an issue that needs to be resolved fairly soon. A solution, in my opinion, would be the option to transfer user accounts across instances. This would help when an instance closes and would ultimately make the fediverse more stable.

    A new user currently has a choice of many instances to join, but there is no assurance of their ongoing existence. Along with that, afaik there is no way to transfer user accounts and data across instances. If users could transfer their accounts and data, there would be less hesitancy to join a newer instance, and accounts and data could be distributed across more instances. This could also work in such a way that if a subset of user data does not meet another instance’s criteria, that subset is not migrated (most likely a community-based data filter).

    Another issue is the same community/magazine existing on multiple instances (say, tech@lemmy.this and tech@kbin.that), which is frustrating for users since they need to track multiple communities for similar content, and the same content gets copied to multiple communities. This should also be resolved by implementing account migration. We are already seeing that communities on certain instances are becoming the prevalent ones, which creates an incentive for the admins of those instances not to shut down. And if they did decide to shut down an instance, users could simply migrate elsewhere, and the prevalent community would keep all its data, just on a new instance.