Comments (9)

NielsRogge commented on August 17, 2024

Hi @edemir206,

here's the code one can use for that:

import vertexai
from vertexai.generative_models import GenerativeModel, Part

# variables for you to fill in
PROJECT_ID = ""
LOCATION = ""
BUCKET_NAME = ""
FILENAME = ""  # name of the PDF object in the bucket (was undefined in the original snippet)

# initialize Vertex AI
vertexai.init(project=PROJECT_ID, location=LOCATION)

model = GenerativeModel("gemini-1.5-pro-preview-0409")

# get the GCS path of the file
gcs_path = f"gs://{BUCKET_NAME}/{FILENAME}"

prompt = "Describe the file"

# send the PDF (referenced by its GCS URI) together with the text prompt
response = model.generate_content(
    [
        Part.from_uri(gcs_path, mime_type="application/pdf"),
        prompt,
    ]
)

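The code can be run once you're authenticated to your Google Cloud project (by running gcloud auth login; when running locally, the SDK reads Application Default Credentials, which gcloud auth application-default login sets up).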
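
As a small illustration of the gcs_path and mime_type arguments used above, here is a minimal sketch that builds the gs:// URI and guesses the MIME type from the file name with Python's standard mimetypes module. The helper name build_gcs_uri and the bucket and object names are hypothetical, chosen only for this example; nothing here calls Vertex AI.

```python
import mimetypes


def build_gcs_uri(bucket_name: str, filename: str) -> str:
    """Return the gs:// URI for an object in a Cloud Storage bucket."""
    # Strip a leading slash so the URI does not end up with "//" after the bucket.
    return f"gs://{bucket_name}/{filename.lstrip('/')}"


# Hypothetical bucket and object names, for illustration only.
filename = "reports/sample.pdf"
gcs_path = build_gcs_uri("my-bucket", filename)

# guess_type returns (type, encoding); the type is None for unknown extensions.
mime_type, _ = mimetypes.guess_type(filename)

print(gcs_path)
print(mime_type)
```

The two values could then be passed to Part.from_uri(gcs_path, mime_type=mime_type) in the snippet above, assuming the guessed type is one the model accepts.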

from cookbook.

NielsRogge commented on August 17, 2024

Looks like this is going to be addressed in #17 as the File API is still in preview

NielsRogge commented on August 17, 2024

Another thing that is confusing is that some guides are doing this:

!pip install -U -q google-generativeai

import google.generativeai as genai

model = genai.GenerativeModel('models/gemini-pro')

whereas other guides are using this:

!pip install -U -q google-cloud-aiplatform

from vertexai.generative_models import GenerativeModel

multimodal_model = GenerativeModel("gemini-1.0-pro-vision")

Which one is recommended? Why are there 2 Python SDKs?

markmcd commented on August 17, 2024

There are 2 SDKs because there are 2 platforms that host the Gemini API. One for Google Cloud Platform customers (minimal setup when running in Google Cloud), and one that does not require a GCP account (API key auth).

Platform   | Docs                                                                 | SDK
Gemini API | https://ai.google.dev/                                               | google-generativeai
Vertex AI  | https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview | google-cloud-aiplatform

This guide is meant to help disambiguate the two. We try to keep the API surfaces aligned, but some parts are different by design - e.g. authentication.
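
To make the authentication difference concrete, here is a hedged sketch that puts the two setups side by side, using only the calls already shown earlier in this thread. The function names make_gemini_api_model and make_vertex_ai_model are hypothetical, and the imports are kept inside each function so that only the SDK you actually use needs to be installed.

```python
def make_gemini_api_model(api_key: str, model_name: str = "models/gemini-pro"):
    """Gemini API (google-generativeai): authenticate with an API key."""
    import google.generativeai as genai  # pip install google-generativeai

    genai.configure(api_key=api_key)
    return genai.GenerativeModel(model_name)


def make_vertex_ai_model(
    project_id: str, location: str, model_name: str = "gemini-1.0-pro-vision"
):
    """Vertex AI (google-cloud-aiplatform): authenticate with Google Cloud credentials."""
    import vertexai  # pip install google-cloud-aiplatform
    from vertexai.generative_models import GenerativeModel

    # No API key here: the project and location identify the GCP resources,
    # and credentials come from the Google Cloud environment.
    vertexai.init(project=project_id, location=location)
    return GenerativeModel(model_name)
```

Either function returns a model object with a generate_content method, which is where the two API surfaces line up again.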

If you want to leave feedback for the docs you linked, you can use the "Send feedback" links on the respective pages.

NielsRogge commented on August 17, 2024

OK, I've been trying to set up an API key by following the authentication notebook.

However, Google AI Studio is not available in my region (Belgium). Does that mean I cannot use the Gemini API?

edemir206 commented on August 17, 2024

@NielsRogge did you find out how to send PDF files to the Gemini API? I'm struggling with this too: I want to send a PDF file via the API for Gemini to summarize. Is that possible, @markmcd?

edemir206 commented on August 17, 2024

I'm using the Google Gemini API, not the Vertex AI API. I'm still a little confused, but I think Gemini API pricing is much lower for my needs.

I tried embedding a base64-encoded PDF via inlineData, but I get an error 500. My JSON is formatted like this:

{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": "Please summarize"
        },
        {
          "inlineData": {
            "mimeType": "application/pdf",
            "data": "<base64-encoded PDF data>"
          }
        }
      ]
    }
  ],
  "generationConfig": {
    "temperature": 0.9,
    "topK": 1,
    "topP": 1,
    "maxOutputTokens": 2048,
    "stopSequences": []
  },
  "safetySettings": [
    {
      "category": "HARM_CATEGORY_HARASSMENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_HATE_SPEECH",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
    {
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  ]
}

Is this not possible via the Gemini API?
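
For reference, a request body of the shape shown above can be assembled with the Python standard library alone. This is only a sketch of the encoding step: the function name build_inline_pdf_request and the placeholder bytes are hypothetical, and it says nothing about whether the endpoint accepts application/pdf. The point it illustrates is that the data field has to be a base64 string, since raw bytes are not valid JSON.

```python
import base64
import json


def build_inline_pdf_request(pdf_bytes: bytes, prompt: str) -> dict:
    """Build a request body with a text part and a base64-encoded inline PDF part."""
    # b64encode returns bytes; decode to str so the value is JSON-serializable.
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return {
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"text": prompt},
                    {"inlineData": {"mimeType": "application/pdf", "data": encoded}},
                ],
            }
        ]
    }


# Placeholder bytes standing in for a real PDF file's contents.
body = build_inline_pdf_request(b"%PDF-1.4 placeholder", "Please summarize")
print(json.dumps(body, indent=2))
```

In practice the bytes would come from reading the file in binary mode, and generationConfig and safetySettings could be added to the dictionary as in the JSON above.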

NielsRogge commented on August 17, 2024

When you use Gemini's API you can follow the guide here: https://ai.google.dev/gemini-api/docs/prompting_with_media

edemir206 commented on August 17, 2024

I could upload the file using the sample but it fails with the error "google.api_core.exceptions.InvalidArgument: 400 Unsupported MIME type: application/pdf"

Here's my code:

import google.generativeai as genai
from IPython.display import Markdown

GOOGLE_API_KEY=""

genai.configure(api_key=GOOGLE_API_KEY)

sample_file = genai.upload_file(path="/home/user/python/test.pdf",
                                display_name="Sample PDF")

print(f"Uploaded file '{sample_file.display_name}' as: {sample_file.uri}")

model = genai.GenerativeModel(model_name="models/gemini-1.5-pro-latest")

response = model.generate_content(["Please Summarize PDF.", sample_file])

Markdown(">" + response.text)

genai.delete_file(sample_file.name)
print(f'Deleted {sample_file.display_name}.')
