10

I want to know how Python knows (if it knows) that a value-type object is already stored in its memory (and also knows where it is).

In this code, when 1 is assigned to b, how does Python know that the value 1 is already in memory, so that it can store a reference to it in b?

>>> a = 1
>>> b = 1
>>> a is b
True
  • 1
    Use print(hex(id(b))) to check memory address for b – Yusufsn Apr 19 at 3:01
  • 1
    >>> hex(id(b)) '0x7ffe705ee350' >>> hex(id(a)) '0x7ffe705ee350' – Just A Lone Apr 19 at 3:03
  • 2
    If two variables refer to the same value between -5 and 256 (as opposed to use) then by definition there is only one object. – Yusufsn Apr 19 at 3:04
  • 1
    @Yusufsn No. For bigger integers (>256) it's not true. – asn-0184 Apr 19 at 3:09
  • 1
    As I said, only values between -5 and 256 – Yusufsn Apr 19 at 3:11
12

Python (CPython, to be precise) keeps a shared set of small integer objects for quick access. The integers in the range [-5, 256] already exist in memory, so if you check their addresses, they are the same. For larger integers, this is not guaranteed.

a = 100000
b = 100000
a is b # False (when each line is entered separately in the REPL)

Wait, what? If you check the address of the numbers, you'll find something interesting:

a = 1
b = 1
id(a) # 4463034512
id(b) # 4463034512

a = 257
b = 257
id(a) # 4642585200
id(b) # 4642585712

This is called the integer cache. You can read more about it here.
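To check this without depending on how the interpreter handles literals, here is a small sketch that builds each integer at runtime from a string. The helper name same_object is only for illustration, and the results in the comments are what I would expect on CPython; other interpreters may behave differently.

```python
import sys


def same_object(n: int) -> bool:
    """Build the integer n twice at runtime and report whether
    both results are the very same object."""
    text = str(n)
    # Both operands are alive at the same time while `is` is evaluated,
    # so two separately allocated ints cannot share an address.
    return int(text) is int(text)


print(sys.implementation.name)  # the notes below assume "cpython"
print(same_object(1))           # True on CPython: 1 comes from the small-int cache
print(same_object(10**6))       # False on CPython: two separate objects are created
print(int("1000000") == int("1000000"))  # True everywhere: == compares values
```

The practical takeaway is to compare integers with == and keep is for identity checks such as x is None.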

Thanks to the comments from @KlausD and @user2357112: small integers written directly in the code come from the integer cache, but the result of a calculation is not always a cached object, even when it equals a number in the range [-5, 256]. For example:

pow(3, 47159012670, 47159012671) is 1 # False
pow(3, 47159012670, 47159012671) == 1 # True

“The current implementation keeps an array of integer objects for all integers between -5 and 256. When you create an int in that range you actually just get back a reference to the existing object.”

Why? Because small integers are used very frequently, for example as loop counters. Returning a reference to an existing object instead of creating a new one avoids the allocation overhead.

  • 6
    Just to make it clear: this is valid for the CPython interpreter. The language Python does not define this and other interpreters are free to have their own implementation. – Klaus D. Apr 19 at 5:07
  • 2
    Also 10e5 is a float, not an int. (Also, not all small ints come from the small int cache. For example, on current CPython, pow(3, 47159012670, 47159012671) == 1, but pow(3, 47159012670, 47159012671) is not 1.) – user2357112 Apr 19 at 5:37
5

If you take a look at Objects/longobject.c, which implements the int type for CPython, you will see that the numbers between -5 (-NSMALLNEGINTS) and 256 (NSMALLPOSINTS - 1) are pre-allocated and cached. This is done to avoid the penalty of allocating multiple unnecessary objects for the most commonly used integers. This works because integers are immutable: you don't need multiple objects to represent the same number.

0

Python doesn't know anything until you tell it. So in your code above, when you initialize a and b, you are storing those values (in a register or RAM) and naming the places where they are stored a and b, so that you can reference them later. If you didn't initialize the variable first, Python would just give you an error.

  • 1
    I think you're missing the point of the question. a == b is obviously true. OP is asking why a is b is true. – Mad Physicist Apr 19 at 3:03
